PARTIAL TRANSPOSITION OF RANDOM STATES AND NON-CENTERED SEMICIRCULAR DISTRIBUTIONS



GUILLAUME AUBRUN 

Abstract. Let $W$ be a Wishart random matrix of size $d^2 \times d^2$, considered as a block matrix with $d \times d$
blocks. Let $Y$ be the matrix obtained by transposing each block of $W$. We prove that the empirical
eigenvalue distribution of $Y$ approaches a non-centered semicircular distribution when $d \to \infty$. We also
show the convergence of extreme eigenvalues towards the edge of the expected spectrum. The proofs are
based on the moments method.

This matrix model is relevant to Quantum Information Theory and corresponds to the partial
transposition of a random induced state. A natural question is: "When does a random state have a positive
partial transpose (PPT)?". We answer this question and exhibit a strong threshold when the parameter
from the Wishart distribution equals 4. When $d$ gets large, a random state on $\mathbf{C}^d \otimes \mathbf{C}^d$ obtained after
partial tracing a random pure state over some ancilla of dimension $\alpha d^2$ is typically PPT when $\alpha > 4$ and
typically non-PPT when $\alpha < 4$.



1. Introduction 

In recent years, several connections were established between Random Matrix Theory and Quantum
Information Theory. It turns out that random operators, and the random constructions they induce, can
be used to construct quantum channels with an unexpected behavior, violating some natural conjectures
(the most prominent example being Hastings's counterexample to additivity conjectures). Random
matrices appear to be a sharp tool for understanding the high-dimensional objects from Quantum
Information Theory.

In this spirit, we study here a model of random matrices motivated by Quantum Information Theory.
The model is simple to describe: start from Wishart $n \times n$ random matrices, which are the most natural
model of random positive matrices. Assume that their dimension is a square ($n = d^2$). These matrices
can be considered as block matrices, with $d^2$ blocks, each block being a $d \times d$ matrix. Our model is
obtained by applying the transposition operation inside each block. An equivalent formulation is to consider
$d^2 \times d^2$ matrices as operators on the tensor product of two $d$-dimensional spaces, and to apply to them
the partial transposition $\mathrm{Id} \otimes T$, where $T$ is the usual transposition.

For this model, the empirical eigenvalue distribution converges towards a non-centered semicircular 
distribution, and the extreme eigenvalues converge towards the edge of the spectrum. These results were 
observed numerically by Žnidarič et al. [23]. The aim of the present paper is to give a complete proof of
these facts. We rely on a standard tool from Random Matrix Theory: the method of moments. 

The fact that the limiting distribution is semicircular is not a complete surprise. In the context of
free probability, semicircular distributions are the non-commutative analogue of Gaussian distributions,
and therefore one expects their appearance in limit theorems. For example, Wigner's celebrated theorem
identifies the centered semicircular distribution as the limit distribution of eigenvalues of random Hermitian
matrices. However, other limiting distributions do appear in the theory: for example, the Wishart matrices
themselves (i.e., without the partial transposition) converge to the so-called Marčenko-Pastur law (see
Section 2.3). Moreover, our model brings some additional exoticism since the limiting distribution is
non-centered.

Since the transposition is not a completely positive map, there is no reason a priori for matrices from
our model to be positive. However, we show that for some range of the parameters, partially
transposed Wishart matrices are typically positive. A threshold occurs when the parameter from the Wishart
distribution equals 4.

The partial transposition appears to play a central role in Quantum Information Theory and is closely 
related to the concept of entanglement. An important class of states is the family of states with a Positive 
Partial Transpose (PPT). Non-PPT states are necessarily entangled [22] and this is the simplest test to
detect entanglement. Let us simply mention a related important open problem known as the distillability
conjecture [13]: it asks whether, for a state $\rho$, non-PPT is equivalent to the existence of a protocol which,
given many copies of $\rho$, distills them to obtain Bell singlets (the most useful form of entanglement). A
positive answer to the distillability conjecture would give a physical meaning to partial transposition. 

The model of Wishart random matrices also has a physical interpretation in terms of open systems:
assume the subsystem $\mathbf{C}^d \otimes \mathbf{C}^d$ is coupled with some environment $\mathbf{C}^p$. If the overall system is in a random
pure state, the state on $\mathbf{C}^d \otimes \mathbf{C}^d$ obtained by partial tracing over $\mathbf{C}^p$ is distributed as a (normalized)
Wishart matrix. Early notable works about entanglement of random states include [16] and [12]. Our
results can be translated in this language. In particular, a random induced state is typically non-PPT
when $p/d^2 < 4$ and is typically PPT when $p/d^2 > 4$. This shows that a threshold for the PPT property
occurs at $p = 4d^2$.

Organization. The paper is organized as follows: Sections 2-7 are written in the language of Random
Matrix Theory and contain the proof of our theorems. Section 2 introduces the model and states Theorem 1
(convergence towards the non-centered semicircle distribution) and Theorem 2 (convergence of the extreme
eigenvalues). Section 3 reminds the reader about non-crossing partitions and the combinatorics behind
the moments method for Wishart matrices, on which we rely heavily. Section 4 shows how to derive
Theorem 1 from moment estimates; the proof of these estimates (the heart of the moments method) is
deferred to Sections 5 and 6. Section 7 contains the proof of Theorem 2. Section 8 connects to Quantum
Information Theory. Section 9 contains some general remarks and possible variations on the model. A
high-level non-technical overview of the result of this paper and of a related article [3] can be found in [3].

2. Background and statement of the main theorem 

2.1. Conventions. By the letters $C, C_0, c, \dots$ we denote absolute constants, whose value may change
from occurrence to occurrence. The integer part of a real number $x$ is denoted by $\lfloor x \rfloor$. We denote by $[k]$
the set $\{1, \dots, k\}$. Addition in $[k]$ is understood modulo $k$. We denote by $a, b, c, \dots$ multi-indices which
are elements of $\mathbf{N}^k$ for some integer $k$. The coordinates of $a$ are denoted $(a_1, \dots, a_k)$.

When $a \in \mathbf{N}^k$, we denote by $\#a$ the number of distinct elements which appear in the set $\{a_1, \dots, a_k\}$.
For example, $\#(1,4,1,2) = 3$. The cardinality of a set $A$ is denoted $\operatorname{card} A$. The notation $\mathbf{1}_E$ denotes a
quantity which equals 1 when the event $E$ is true, and 0 otherwise.

By $\|A\|_\infty$ or simply $\|A\|$ we denote the operator norm of a matrix $A$.

2.2. Semicircular and Marčenko-Pastur distributions. Let $m \in \mathbf{R}$ and $\sigma > 0$. The semicircular
distribution with mean $m$ and variance $\sigma^2$ is the probability distribution $\mu_{SC(m,\sigma^2)}$ with support
$[m - 2\sigma, m + 2\sigma]$ and density

$$\frac{d\mu_{SC(m,\sigma^2)}}{dx} = \frac{1}{2\pi\sigma^2} \sqrt{4\sigma^2 - (x - m)^2}.$$

It is well-known ([1], page 7) that if $X$ is a random variable with $SC(0,1)$ distribution, the moments of
$X$ are related to the Catalan numbers $C_k = \frac{1}{k+1}\binom{2k}{k}$:

$$\mathbf{E} X^{2k} = C_k, \qquad \mathbf{E} X^{2k+1} = 0.$$
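As a quick numerical sanity check of the moment formula above, the following Python sketch integrates the semicircular density with a midpoint rule and compares the result with Catalan numbers. The helper names (`catalan`, `semicircle_moment`) and the grid size are illustrative assumptions, not part of the original text.

```python
import math
import numpy as np

def catalan(k):
    """k-th Catalan number C_k = binom(2k, k) / (k + 1)."""
    return math.comb(2 * k, k) // (k + 1)

def semicircle_moment(j, m=0.0, sigma=1.0, num=200_000):
    """j-th moment of SC(m, sigma^2), approximated by a midpoint rule (illustrative helper)."""
    edges = np.linspace(m - 2 * sigma, m + 2 * sigma, num + 1)
    x = 0.5 * (edges[:-1] + edges[1:])      # midpoints of the grid cells
    dx = edges[1] - edges[0]
    density = np.sqrt(np.maximum(4 * sigma**2 - (x - m) ** 2, 0.0)) / (2 * np.pi * sigma**2)
    return float(np.sum(x**j * density) * dx)

if __name__ == "__main__":
    for k in range(5):
        # even moments should be close to C_k, odd moments close to 0
        print(k, catalan(k), semicircle_moment(2 * k), semicircle_moment(2 * k + 1))
```

The same helper can be called with `m=1.0` and `sigma=0.5` to look at a non-centered semicircular distribution such as $SC(1, 1/4)$.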

We now introduce the Marčenko-Pastur distributions. First, for $0 < \alpha \le 1$, let $f_\alpha$ be the probability
density defined on $[b_-, b_+]$ (where $b_\pm = (1 \pm \sqrt{\alpha})^2$) by

$$f_\alpha(x) = \frac{\sqrt{(x - b_-)(b_+ - x)}}{2\pi x \alpha}.$$

The Marčenko-Pastur distribution with parameter $\alpha$, $\mu_{MP(\alpha)}$, is the following probability distribution:

• If $\alpha \ge 1$, then $\mu_{MP(\alpha)}$ is the probability distribution with density $f_{1/\alpha}$.

• If $0 < \alpha \le 1$, then $d\mu_{MP(\alpha)}(x) = (1 - \alpha)\,\delta_0 + \alpha\, df_\alpha(x)$, where $\delta_0$ denotes a Dirac mass at 0.

In particular, note the following fact: if $X$ has a semicircle $SC(0,1)$ distribution, then $X^2$ has a
Marčenko-Pastur $MP(1)$ distribution.

2.3. Asymptotic spectrum of Wishart matrices: Marčenko-Pastur distributions. Define an
$(n,p)$-Wishart matrix as a random $n \times n$ matrix $W$ obtained by setting $W = \frac{1}{p} G G^\dagger$, where $G$ is an
$n \times p$ matrix with independent (real or complex; see footnote 3 at the end of this subsection) $N(0,1)$ entries.
The real case and complex case are
completely similar. Our results are valid for both, although only the complex case is relevant to Quantum
Information Theory.

Let $A$ be an $n \times n$ Hermitian matrix, and denote by $\lambda_1, \dots, \lambda_n$ the eigenvalues of $A$. The empirical eigenvalue
distribution of $A$, denoted $N_A$, is the probability measure on Borel subsets of $\mathbf{R}$ defined as

$$N_A = \frac{1}{n} \sum_{i=1}^{n} \delta_{\lambda_i}.$$



In other words, $N_A(B)$ is the proportion of eigenvalues that belong to the Borel set $B$. For large sizes,
the empirical eigenvalue distribution of a Wishart matrix approaches a Marčenko-Pastur distribution.

Theorem (Marčenko-Pastur, [18]). Fix $\alpha > 0$. For every $n$, let $W_n$ be an $(n, \lfloor \alpha n \rfloor)$-Wishart matrix. Then
the empirical eigenvalue distribution of $W_n$ approaches a Marčenko-Pastur distribution $MP(\alpha)$ in the
following sense. For every interval $I \subset \mathbf{R}$ and any $\varepsilon > 0$,

$$\lim_{n \to \infty} \mathbf{P}\left(|N_{W_n}(I) - \mu_{MP(\alpha)}(I)| > \varepsilon\right) = 0.$$



Footnote 3. A complex-valued random variable $\xi$ has a complex $N(0, 1)$ distribution if its real and imaginary parts are independent
random variables with real $N(0, \frac{1}{2})$ distribution. In particular, $\mathbf{E}|\xi|^2 = 1$.

2.4. Partial transposition. We now assume that $n = d^2$. One can think of any $n \times n$ matrix $A$ as a
block matrix, consisting of $d \times d$ blocks, each block being a $d \times d$ matrix. The entries of the matrix are
then conveniently described using 4 indices ranging from 1 to $d$:

$$A = \left(A^{k,l}_{i,j}\right).$$

Here $i$ denotes the block row index, $j$ the block column index, $k$ the row index inside the block and $l$
the column index inside the block $(i, j)$. We can then apply to each block of $A$ the transposition operation.
The resulting matrix is denoted $A^\Gamma$ and called the partial transposition of $A$. Using indices, we may write

$$(A^\Gamma)^{k,l}_{i,j} = A^{l,k}_{i,j}. \tag{1}$$

Such a block matrix $A$ can be naturally seen as an operator on $\mathbf{C}^d \otimes \mathbf{C}^d$. Indeed, a natural basis in this
space is the double-indexed family $(e_i \otimes e_k)_{1 \le i,k \le d}$, where $(e_i)$ is the canonical basis of $\mathbf{C}^d$. The action of
$A$ on this basis is described as

$$A(e_i \otimes e_k) = \sum_{j,l=1}^{d} A^{l,k}_{j,i}\, e_j \otimes e_l.$$

We may identify canonically $\mathcal{M}(\mathbf{C}^d \otimes \mathbf{C}^d)$ with $\mathcal{M}(\mathbf{C}^d) \otimes \mathcal{M}(\mathbf{C}^d)$. Via this identification, the matrix $A^\Gamma$
coincides with $(\mathrm{Id} \otimes T)(A)$, where $T : \mathcal{M}(\mathbf{C}^d) \to \mathcal{M}(\mathbf{C}^d)$ is the usual transposition map. The map $T$ is the
simplest example of a map which is positive but not completely positive: $A \ge 0$ does not imply $A^\Gamma \ge 0$.
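The block-wise transposition can be sketched in a few lines of Python with NumPy. The example below is only an illustration: the function names are hypothetical, and the test matrix (the projector onto a Bell vector in $\mathbf{C}^2 \otimes \mathbf{C}^2$) is a standard choice that is not taken from the text. It shows a positive matrix whose partial transposition has a negative eigenvalue.

```python
import numpy as np

def partial_transpose(a, d):
    """Transpose each d x d block of a (d^2 x d^2) matrix, i.e. apply Id (x) T."""
    # axes after reshape: (block row i, inner row k, block column j, inner column l)
    blocks = a.reshape(d, d, d, d)
    # swapping the two inner axes transposes every block (i, j)
    return blocks.transpose(0, 3, 2, 1).reshape(d * d, d * d)

def bell_state_projector():
    """Projector onto (e_1 (x) e_1 + e_2 (x) e_2) / sqrt(2) in C^2 (x) C^2."""
    psi = np.zeros(4)
    psi[0] = psi[3] = 1 / np.sqrt(2)
    return np.outer(psi, psi)

if __name__ == "__main__":
    rho = bell_state_projector()
    print(np.linalg.eigvalsh(rho))                        # spectrum of the positive matrix
    print(np.linalg.eigvalsh(partial_transpose(rho, 2)))  # spectrum after partial transposition
```

Applying `partial_transpose` twice returns the original matrix, which is a convenient check that the axis permutation is the intended one.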

2.5. Asymptotic spectrum of partially transposed Wishart matrices: non-centered semicircular
distribution. Motivated by Quantum Information Theory, we investigate the following question:
what does the spectrum of $A^\Gamma$ look like? As we will see, the partial transposition dramatically changes the
spectrum: the empirical eigenvalue distribution of $A^\Gamma$ is no longer close to a Marčenko-Pastur distribution,
but to a shifted semicircular distribution! This is our main theorem.

Theorem 1. Fix $\alpha > 0$. For every $d$, let $W_d$ be a $(d^2, \lfloor \alpha d^2 \rfloor)$-Wishart matrix, and let $Y_d = W_d^\Gamma$ be the
partial transposition of $W_d$. Then the empirical eigenvalue distribution of $Y_d$ approaches the semicircular
distribution $\mu_{SC(1,1/\alpha)}$ in the following sense. For every interval $I \subset \mathbf{R}$ and any $\varepsilon > 0$,

$$\lim_{d \to \infty} \mathbf{P}\left(|N_{Y_d}(I) - \mu_{SC(1,1/\alpha)}(I)| > \varepsilon\right) = 0.$$

Recall that $N_{Y_d}(I)$ is the proportion of eigenvalues of the matrix $Y_d$ that belong to the interval $I$.

Note that the trace and the Hilbert-Schmidt norm are obviously invariant under partial transpose. The
distributions $MP(\alpha)$ and $SC(1, 1/\alpha)$ (corresponding to eigenvalue distribution before and after applying
partial transpose) indeed share the same first and second moments.

The support of the limiting spectral distribution $SC(1, 1/\alpha)$ is the interval $[1 - \frac{2}{\sqrt{\alpha}}, 1 + \frac{2}{\sqrt{\alpha}}]$. Denote
by $\lambda_{\min}(A)$ (resp. $\lambda_{\max}(A)$) the smallest (resp. largest) eigenvalue of a matrix $A$. A natural (and harder)
question is whether the extreme eigenvalues of $Y_d$ converge towards $1 \pm \frac{2}{\sqrt{\alpha}}$. We show that this is indeed
the case:

Theorem 2. Fix $\alpha > 0$. For every $d$, let $W_d$ be a $(d^2, \lfloor \alpha d^2 \rfloor)$-Wishart matrix, and let $Y_d = W_d^\Gamma$ be the
partial transposition of $W_d$. Then, for every $\varepsilon > 0$,

$$\lim_{d \to \infty} \mathbf{P}\left(|\lambda_{\max}(Y_d) - (1 + 2/\sqrt{\alpha})| > \varepsilon\right) = 0,$$

$$\lim_{d \to \infty} \mathbf{P}\left(|\lambda_{\min}(Y_d) - (1 - 2/\sqrt{\alpha})| > \varepsilon\right) = 0.$$

Footnote. An explanation for the notation $A^\Gamma$ is that $\Gamma$ is "half" of the letter $T$ which denotes the usual transposition.
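The statements of Theorems 1 and 2 can be explored numerically. The sketch below samples one complex Wishart matrix, applies the block-wise transposition and reports the mean, variance and smallest eigenvalue of the result. It is a minimal illustration under stated assumptions: the dimension $d = 10$, the seed and all function names are arbitrary choices, and a single finite-size sample only approximates the limiting behavior.

```python
import numpy as np

def wishart(n, p, rng):
    """(n, p)-Wishart matrix W = G G^dagger / p with complex N(0, 1) entries."""
    g = (rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))) / np.sqrt(2)
    return g @ g.conj().T / p

def partial_transpose(a, d):
    """Transpose each d x d block of a (d^2 x d^2) matrix."""
    return a.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)

def spectrum_of_partial_transpose(d, alpha, seed=0):
    """Eigenvalues of Y_d = W_d^Gamma for a (d^2, floor(alpha d^2))-Wishart matrix."""
    rng = np.random.default_rng(seed)
    w = wishart(d * d, int(alpha * d * d), rng)
    return np.linalg.eigvalsh(partial_transpose(w, d))

if __name__ == "__main__":
    for alpha in (1.0, 8.0):
        eig = spectrum_of_partial_transpose(d=10, alpha=alpha)
        # compare with mean 1, variance 1/alpha and lower edge 1 - 2/sqrt(alpha)
        print(alpha, eig.mean(), eig.var(), eig.min(), 1 - 2 / np.sqrt(alpha))
```

With $\alpha = 1$ the predicted lower edge $1 - 2/\sqrt{\alpha}$ is negative, while with $\alpha = 8$ it is positive, so the two runs sit on opposite sides of the threshold $\alpha = 4$.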

2.6. Almost sure convergence. In Random Matrix Theory, it is customary to work with the stronger
notion of almost sure convergence. This requires defining all the objects on a single probability space.
Such a construction is not natural from a Quantum Information Theory point of view, which usually
"avoids infinity" and prefers to work in a fixed (but large) dimension.

However, from a mathematical point of view, it is interesting to note that the results presented here also
hold for almost sure convergence. One needs to check that the proof gives enough concentration in order
to use the Borel-Cantelli lemma. A key point is the $O(1/d^2)$ estimate for the variance obtained in the proof
of Proposition 4.2.

3. Non-crossing partitions and combinatorics of Wishart matrices

3.1. Non-crossing partitions. Let $S$ be a finite set with a total order $<$. Usually, $S$ equals $[k]$ (the
set $\{1, \dots, k\}$) for some positive integer $k$, and additions in $[k]$ are understood modulo $k$. It is useful to
represent elements of $S$ as points on a circle. We introduce the concept of non-crossing partitions and
refer to [20] for more information and pictures.

• A partition $\pi$ of $S$ is a family $\{V_1, \dots, V_p\}$ of disjoint nonempty subsets of $S$, whose union is $S$.
The sets $V_i$ are called the blocks of $\pi$. The number of blocks of $\pi$ is denoted $|\pi|$. We denote by $\sim_\pi$
the equivalence relation on $S$ induced by $\pi$: $i \sim_\pi j$ means that $i$ and $j$ belong to the same block.

• A partition $\pi$ of $S$ is said to be non-crossing if there do not exist elements $i < j < k < l$ in $S$
such that $i \sim_\pi k$, $j \sim_\pi l$ and $i \not\sim_\pi j$. We denote by $NC(S)$ the set of non-crossing partitions of $S$,
and $NC(k) = NC([k])$.

• A chording (or a non-crossing pair partition) of $S$ is a non-crossing partition of $S$ in which each
block contains exactly two elements. Chordings exist only when the cardinality of $S$ is even. We
denote by $NC_2(S)$ the set of chordings of $S$, and $NC_2(k) = NC_2([k])$.

Counting non-crossing partitions is a well-known combinatorial problem involving Catalan numbers (see
[20], Lemma 8.9 and Proposition 9.4).

Lemma 3.1. Let $k \in \mathbf{N}^*$. The number of elements in $NC(k)$ and the number of elements in $NC_2(2k)$
are both equal to the $k$th Catalan number $C_k = \frac{1}{k+1}\binom{2k}{k}$.
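For small $k$, Lemma 3.1 can be checked by brute force. The following Python sketch enumerates all set partitions, keeps the non-crossing ones according to the definition above, and compares the counts with Catalan numbers. All function names are illustrative, and the enumeration is exponential in $k$, so it is only meant for small cases.

```python
from itertools import combinations
from math import comb

def set_partitions(elements):
    """Yield all partitions of the list `elements` as lists of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        for idx in range(len(partition)):   # add `first` to an existing block
            yield partition[:idx] + [[first] + partition[idx]] + partition[idx + 1:]
        yield [[first]] + partition         # or open a new block

def is_noncrossing(partition):
    """True if no i < j < k < l satisfy i ~ k, j ~ l with i and j in different blocks."""
    block_of = {x: n for n, block in enumerate(partition) for x in block}
    for i, j, k, l in combinations(sorted(block_of), 4):
        if block_of[i] == block_of[k] and block_of[j] == block_of[l] and block_of[i] != block_of[j]:
            return False
    return True

def count_nc(k):
    """Number of non-crossing partitions of {1, ..., k}."""
    return sum(is_noncrossing(p) for p in set_partitions(list(range(1, k + 1))))

def count_chordings(k):
    """Number of non-crossing pair partitions of {1, ..., k}."""
    return sum(all(len(b) == 2 for b in p) and is_noncrossing(p)
               for p in set_partitions(list(range(1, k + 1))))

def catalan(k):
    return comb(2 * k, k) // (k + 1)
```

For instance, `count_nc(4)` can be compared with `catalan(4)`, and `count_chordings(6)` with `catalan(3)`.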

Let us also introduce the Kreweras complementation as the map $K : NC(k) \to NC(k)$ defined as follows.
For $\pi \in NC(\{1^-, \dots, k^-\}) \simeq NC(k)$, $K(\pi)$ is defined as the coarsest partition $\sigma \in NC(\{1^+, \dots, k^+\}) \simeq
NC(k)$ such that $\pi \cup \sigma$ is a non-crossing partition of $\{1^-, 1^+, \dots, k^-, k^+\}$, equipped with the order

$$1^- < 1^+ < 2^- < 2^+ < \cdots < k^- < k^+.$$

The map $K$ is bijective. Moreover, given $\sigma \in NC(\{1^+, \dots, k^+\}) \simeq NC(k)$, one can recover $K^{-1}(\sigma)$
as the coarsest partition $\pi \in NC(\{1^-, \dots, k^-\}) \simeq NC(k)$ such that $\pi \cup \sigma$ is a non-crossing partition of
$\{1^-, 1^+, \dots, k^-, k^+\}$. See [20] for more details.

The following lemma will be used in connection to partial transposition. 

Lemma 3.2. Let $\pi \in NC(k)$ be a non-crossing partition and $K(\pi)$ its Kreweras complement. Then,

(1) For every index $i \in [k]$,

The singleton $\{i\}$ is a block in $K(\pi)$ $\iff$ $i \sim_\pi i + 1$.

(2) For every distinct indices $i, j \in [k]$,

The pair $\{i, j\}$ is a block in $K(\pi)$ $\iff$ $i \sim_\pi j + 1$ and $i + 1 \sim_\pi j$ and $i \not\sim_\pi j$.

Proof. This is geometrically obvious. □

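3.2. Combinatorics related to Wishart matrices. We now remind the reader about the (standard)
proof of the Marčenko-Pastur theorem via the moments method. This proof can be found for example in
[15], [21] or the book [5]. Not only will our proof mimic this one, but we will actually reuse most
of the combinatorial lemmas. Let $W_n = (W_{ij})$ be an $(n,p)$-Wishart matrix, and $k \in \mathbf{N}$. The expansion of
$\mathbf{E}\, \frac{1}{n} \operatorname{Tr} W_n^k$ reads

$$\mathbf{E}\, \frac{1}{n} \operatorname{Tr} W_n^k = \frac{1}{n} \sum_{a \in [n]^k} \mathbf{E}\, W_{a_1,a_2} W_{a_2,a_3} \cdots W_{a_k,a_1}$$

$$= \frac{1}{n p^k} \sum_{a \in [n]^k,\, c \in [p]^k} \mathbf{E}\, G_{a_1,c_1} \overline{G_{a_2,c_1}}\, G_{a_2,c_2} \overline{G_{a_3,c_2}} \cdots G_{a_k,c_k} \overline{G_{a_1,c_k}}. \tag{2}$$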

The next task is to analyze which couples $(a, c)$ give dominant contributions to the sum (2) when $n \to \infty$
and $p = \lfloor \alpha n \rfloor$. One argues as follows. First, if one couple $(a_j, c_j)$ or $(a_{j+1}, c_j)$ appears an odd number of
times in the product, then the contribution is exactly zero (because entries of $G$ are independent and
symmetric). This motivates the following definition:

Definition. A couple $(a, c) \in \mathbf{N}^k \times \mathbf{N}^k$ satisfies the Wishart matching condition if every couple in the
following list of $2k$ elements appears an even number of times:

$$(a_1, c_1),\ (a_2, c_1),\ (a_2, c_2),\ (a_3, c_2),\ \dots,\ (a_k, c_k),\ (a_1, c_k). \tag{3}$$

Let $(a, c) \in \mathbf{N}^k \times \mathbf{N}^k$. We define $d_W(a, c)$ as the number of distinct couples appearing in the list (3), and
set $\ell_W(a, c) = \#a + \#c$. We also denote by $n_2(a, c)$ the number of indices $i$ such that the $i$th element appears
exactly twice in the list (3), and by $n_+(a, c)$ the number of indices $i$ such that the $i$th element appears at
least 4 times. Note that $n_2(a, c) + n_+(a, c) = 2k$. These parameters satisfy some inequalities:

Lemma 3.3. Let $(a, c) \in \mathbf{N}^k \times \mathbf{N}^k$ satisfy the Wishart matching condition. Then

$$\ell_W(a, c) \le d_W(a, c) + 1 \le k + 1.$$

Moreover, $n_+(a, c) \le 4(k + 1 - \ell_W(a, c))$.

Proof. Read the list (3) from left to right, and count how many new indices you read. The first couple
$(a_1, c_1)$ brings two new indices, and each subsequent couple that did not appear previously in the list
(there are $d_W(a, c) - 1$ such couples) may bring at most one new index (since it shares a common index
with the couple just before). This shows that $\ell_W(a, c) \le d_W(a, c) + 1$.

The inequality $d_W(a, c) \le k$ is easy: if every couple in the list (3) appears at least twice, then this list
contains at most $k$ different couples.

For the last claim, note that

$$d_W(a, c) \le \frac{1}{2} n_2(a, c) + \frac{1}{4} n_+(a, c) = k - \frac{1}{4} n_+(a, c),$$

with equality iff no element in the list (3) appears 6 times or more. □

Now, the couples $(a, c)$ satisfying $\ell_W(a, c) < k + 1$ are easily shown to have a contribution to the sum
(2) which is asymptotically zero. Let us say that $(a, c)$ is Wishart-admissible if it satisfies the matching
condition, together with the equality $\ell_W(a, c) = k + 1$.

If $a \in \mathbf{N}^k$, the partition induced by $a$, denoted $\pi(a)$, is the partition of $[k]$ defined as follows: $i$ and
$j$ belong to the same block if and only if $a_i = a_j$. We say that $a, b \in \mathbf{N}^k$ are equivalent ($a \sim b$) if
$\pi(a) = \pi(b)$. Similarly, a couple $(a, c)$ is equivalent to a couple $(a', c')$ if $a \sim a'$ and $c \sim c'$. The next
proposition (see [15] or [21] for details) characterizes the combinatorial structure of (equivalence classes
of) Wishart-admissible couples.

Proposition 3.4. For every integer $k$,

(a) If $(a, c) \in \mathbf{N}^k \times \mathbf{N}^k$ is Wishart-admissible, then

(i) Each couple in the list (3) appears exactly twice. One occurrence is of the form $(a_j, c_j)$ while
the other occurrence is of the form $(a_{j+1}, c_j)$. Moreover, the pair-partition of $[2k]$ induced by
the list (3) is non-crossing.

(ii) The partitions $\pi(a)$ and $\pi(c)$ are non-crossing, and Kreweras-complementary: $\pi(c) = K(\pi(a))$.
In particular, $a$ is determined by $c$ up to equivalence.

(b) The mapping $(a, c) \mapsto \pi(c)$ induces a bijection between the set of equivalence classes of
Wishart-admissible couples in $\mathbf{N}^k \times \mathbf{N}^k$ and the set $NC(k)$.

Example. Let us give an example of a Wishart-admissible couple for $k = 4$. Let $a = (1, 2, 2, 3)$ and
$c = (7, 3, 7, 7)$. Then $\ell_W(a, c) = 5$. The list (3) reads as

$$(1, 7);\ (2, 7);\ (2, 3);\ (2, 3);\ (2, 7);\ (3, 7);\ (3, 7);\ (1, 7).$$

Indeed, each couple appears exactly twice. The partition induced by this list is

$$\{\{1, 8\}, \{2, 5\}, \{3, 4\}, \{6, 7\}\}$$

while the partitions induced by $a$ and $c$ are

$$\pi(a) = \{\{1\}, \{2, 3\}, \{4\}\},$$

$$\pi(c) = K(\pi(a)) = \{\{1, 3, 4\}, \{2\}\}.$$
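The example above can be replayed mechanically. The Python sketch below builds the list (3), tests the matching condition, and computes induced partitions and a Kreweras complement. The function names are hypothetical, and the complement is computed through a standard characterization that is assumed here rather than taken from the text: for $i < j$, the indices $i$ and $j$ lie in the same block of $K(\pi)$ exactly when $\{i+1, \dots, j\}$ is a union of blocks of $\pi$.

```python
from collections import Counter

def wishart_list(a, c):
    """The list (3): (a_1,c_1), (a_2,c_1), (a_2,c_2), ..., (a_k,c_k), (a_1,c_k)."""
    k = len(a)
    out = []
    for i in range(k):
        out.append((a[i], c[i]))
        out.append((a[(i + 1) % k], c[i]))
    return out

def satisfies_matching(a, c):
    """Wishart matching condition: every couple of the list (3) appears an even number of times."""
    return all(n % 2 == 0 for n in Counter(wishart_list(a, c)).values())

def ell_w(a, c):
    """The parameter l_W(a, c) = #a + #c."""
    return len(set(a)) + len(set(c))

def induced_partition(a):
    """Partition of {1, ..., len(a)}: i ~ j iff a_i = a_j, returned as a set of frozensets."""
    blocks = {}
    for i, v in enumerate(a, start=1):
        blocks.setdefault(v, set()).add(i)
    return {frozenset(b) for b in blocks.values()}

def kreweras(partition, k):
    """Kreweras complement of a non-crossing partition of {1, ..., k} (assumed characterization)."""
    block_of = {x: b for b in partition for x in b}

    def linked(i, j):
        # i < j are linked iff {i+1, ..., j} is a union of blocks of the partition
        inside = set(range(i + 1, j + 1))
        return all(block_of[x] <= inside for x in inside)

    blocks = []
    for i in range(1, k + 1):
        for b in blocks:
            if linked(min(b), i):
                b.add(i)
                break
        else:
            blocks.append({i})
    return {frozenset(b) for b in blocks}
```

Calling `kreweras(induced_partition((1, 2, 2, 3)), 4)` and comparing with `induced_partition((7, 3, 7, 7))` reproduces the last display of the example.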

From Proposition 3.4 it is easy to check (if $p \sim \alpha n$) that $\lim_{n \to \infty} \mathbf{E}\, \frac{1}{n} \operatorname{Tr} W_n^k$ coincides with the $k$th
moment of the Marčenko-Pastur distribution $MP(\alpha)$. To obtain more information than
convergence in expectation, one usually also needs a control of the variance of $\frac{1}{n} \operatorname{Tr} W_n^k$. The next lemma
is then relevant. Actually, the stronger conclusion $\ell_W(a, c) + \ell_W(a', c') \le 2k$ holds, but we do not need
this sophistication here.

Lemma 3.5. Let $(a, c)$ and $(a', c')$ be two couples in $\mathbf{N}^k \times \mathbf{N}^k$ satisfying the following conditions:

(i) Each couple in the following list of $4k$ elements appears at least twice:

$$(a_1, c_1),\ (a_2, c_1),\ \dots,\ (a_k, c_k),\ (a_1, c_k);\quad (a'_1, c'_1),\ (a'_2, c'_1),\ \dots,\ (a'_k, c'_k),\ (a'_1, c'_k). \tag{4}$$

(ii) At least one couple appears both in the left half and in the right half of the list (4).

Then $\ell_W(a, c) + \ell_W(a', c') \le 2k + 1$.

Proof. As before, we read the list (4) and keep track of the number of indices. We first read the left half of
the list in its natural order. We then read the right half of the list, starting with an element which already
appeared in the left half and reading from left to right (with the convention that $(a'_1, c'_1)$ stands at the
right of $(a'_1, c'_k)$).

The first element $(a_1, c_1)$ brings two new indices, and each subsequent new couple (there are at most
$2k - 1$ many, since each couple in the list appears at least twice) brings at most one new index. □

If we want to prove estimates on the extreme eigenvalues of Wishart matrices, we also have to analyze
lower-order contributions. We here follow the terminology from [10]. Let $(a, c) \in \mathbf{N}^k \times \mathbf{N}^k$ satisfy the
Wishart matching condition. The elements from the list (3) fall into one of the following categories.

type 1: innovations for $a$.

type 2: innovations for $c$.

type 3: first repetitions of an innovation.

type 4: other elements.

The $i$th element in the list (3) is an innovation if it contains one index which did not appear already in the
list. When $i = 2p$ is even, the $i$th element is an innovation for $a$ if $a_{p+1} \notin \{a_j : j \le p\}$. When $i = 2p - 1$
is odd, the $i$th element is an innovation for $c$ if $c_p \notin \{c_j : j < p\}$. In particular, the first element of the
list (3) is always an innovation for $c$.

The $i$th element is the first repetition of an innovation if there is a unique $j < i$ such that the $j$th
element from the list (3) equals the $i$th element, and moreover this $j$th element is an innovation.

The following lemma asserts that there are few different couples satisfying the Wishart matching
condition which have the same types of elements at the same positions. We refer to [10] for a proof.

Lemma 3.6. Let $T = (t_1, \dots, t_{2k}) \in \{1, 2, 3, 4\}^{2k}$, and let $U = \operatorname{card}\{i \in [2k] : t_i = 4\}$. Say that $(a, c)$ is
of type $T$ if, for every $i \in [2k]$, the $i$th element in the list (3) has type $t_i$. Then, the number of equivalence
classes of couples satisfying the Wishart matching condition which are of type $T$ is bounded by $k^{3U}$.

3.3. Diagonal elements of Wishart matrices are close to 1. We will use the following simple fact
in our proof.

Lemma 3.7. Let $W = (W_{ij})$ be an $(n,p)$-Wishart matrix. Then, for any $\varepsilon \in (0, 1)$, we have

$$\mathbf{P}\left(1 - \varepsilon \le \inf_i W_{ii} \le \sup_i W_{ii} \le 1 + \varepsilon\right) \ge 1 - C n \exp(-c p \varepsilon^2),$$

where $C, c > 0$ are absolute constants.

Proof. Recall that $W = \frac{1}{p} G G^\dagger$, where $G = (G_{ij})$ is an $n \times p$ matrix with independent $N(0, 1)$ entries, so
that the diagonal terms of $W$ follow, up to the normalization by $p$, a $\chi^2$ distribution.

The next fact shows that such distributions enjoy very strong concentration properties.

Fact 3.8. Let $g_1, \dots, g_p$ denote independent (real or complex) $N(0, 1)$ random variables, and $X$ be the
Euclidean norm of the vector $(g_1, \dots, g_p)$. Then for every $t > 0$,

$$\mathbf{P}\left(|X - \sqrt{p}| > t\right) \le C' \exp(-c' t^2).$$

Fact 3.8 can be proved by direct calculation or follows from concentration of measure (see e.g. [17]).
Indeed, the Euclidean norm is a 1-Lipschitz function and the expectation of $X$ satisfies the inequalities
$\sqrt{p - 1} \le \mathbf{E} X \le \sqrt{p}$. Lemma 3.7 follows from Fact 3.8 via the union bound. □

4. Proof of Theorem 1

For an integer $d$ and $p = \lfloor \alpha d^2 \rfloor$, let $G_d$ be a $d^2 \times p$ matrix with independent $N(0, 1)$ entries, $W_d = \frac{1}{p} G_d G_d^\dagger$
and $Y_d = W_d^\Gamma$. We denote the entries of $G_d$ as $(G^k_{i,j})$, where $(i, j) \in [d] \times [d]$ denote the row indices and
$k \in [p]$ denotes the column index. We label the entries of $W_d$ and $Y_d$ as $(W^{b,b'}_{a,a'})$ and $(Y^{b,b'}_{a,a'})$, where
$(a, a', b, b') \in [d]^4$, according to the convention described in Section 2.4.

We have to show that $N_{Y_d}$, the empirical eigenvalue distribution of $Y_d$, approaches a non-centered
semicircular distribution $SC(1, 1/\alpha)$. To handle a more symmetric situation (involving a centered semicircular
distribution), we will rather consider $Y_d - \mathrm{Id}$. By Lemma 3.7 this matrix is very close to $Z_d = Y_d - \operatorname{diag}(Y_d)$.
The latter behaves in a nicer way with respect to moments combinatorics. We label the entries of $Z_d$ as
$(Z^{j,j'}_{i,i'})_{i,i',j,j' \in [d]}$. We have

$$Z^{j,j'}_{i,i'} = Y^{j,j'}_{i,i'}\, \mathbf{1}_{(i,j) \ne (i',j')}.$$

The following proposition is central to our work. We defer the proof (the combinatorial part of the
moments method) to the next section.

Proposition 4.1. For every fixed integer $k$, we have

$$\lim_{d \to \infty} \mathbf{E}\left(\frac{1}{d^2} \operatorname{Tr}(Z_d^k)\right) = \begin{cases} C_{k/2}\, \alpha^{-k/2} & \text{if } k \text{ is even}, \\ 0 & \text{otherwise}. \end{cases}$$

We also show that the variance goes to zero; this is actually simpler.

Proposition 4.2. For every fixed integer $k$, we have

$$\lim_{d \to \infty} \operatorname{Var}\left(\frac{1}{d^2} \operatorname{Tr}(Z_d^k)\right) = 0.$$

The proofs of Propositions 4.1 and 4.2 appear in Sections 5 and 6 respectively.
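Proof of Theorem 1 (assuming Propositions 4.1 and 4.2). We claim that for any interval $I \subset \mathbf{R}$ and $\varepsilon > 0$,

$$\lim_{d \to \infty} \mathbf{P}\left(|N_{Z_d}(I) - \mu_{SC(0,1/\alpha)}(I)| > \varepsilon\right) = 0. \tag{5}$$

Deriving this from Propositions 4.1 and 4.2 is a completely standard procedure. We only sketch the
proof and refer to [1] (pp. 10-11) for more details. Recall that the Catalan numbers satisfy $C_k \le 4^k$,
and that the support of the $SC(0, 1/\alpha)$ distribution is $[-2/\sqrt{\alpha}, 2/\sqrt{\alpha}]$. We first check that the proportion
of eigenvalues outside $J = [-3/\sqrt{\alpha}, 3/\sqrt{\alpha}]$ is asymptotically zero. For every $\varepsilon > 0$ and even integer $k$,

$$\begin{aligned}
\limsup_{d \to \infty} \mathbf{P}\left(N_{Z_d}(J^c) > \varepsilon\right) &\le \frac{1}{\varepsilon} \limsup_{d \to \infty} \mathbf{E}\, N_{Z_d}(J^c) \\
&\le \frac{1}{\varepsilon} \limsup_{d \to \infty} \mathbf{E} \int x^k (\sqrt{\alpha}/3)^k \, dN_{Z_d} \\
&\le \frac{1}{\varepsilon} (\sqrt{\alpha}/3)^k\, C_{k/2}\, \alpha^{-k/2} \\
&\le \frac{1}{\varepsilon} (2/3)^k,
\end{aligned}$$

where the second inequality follows from $\mathbf{1}_{J^c}(x) \le x^k (\sqrt{\alpha}/3)^k$. Since $k$ is arbitrarily large, we obtain that
$\mathbf{P}(N_{Z_d}(J^c) > \varepsilon)$ tends to 0.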






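Therefore, to prove (5), we may assume $I \subset J$. Using the Weierstrass approximation theorem, we may
find a polynomial $Q \ge \mathbf{1}_I$ such that $\int Q \, d\mu_{SC(0,1/\alpha)} \le \mu_{SC(0,1/\alpha)}(I) + \varepsilon/2$. It follows from Proposition 4.2
that

$$\lim_{d \to \infty} \operatorname{Var} \int Q \, dN_{Z_d} = 0.$$

By Proposition 4.1, for $d$ large enough, $\left| \mathbf{E} \int Q \, dN_{Z_d} - \int Q \, d\mu_{SC(0,1/\alpha)} \right| < \varepsilon/4$. Then

$$\mathbf{P}\left(N_{Z_d}(I) > \mu_{SC(0,1/\alpha)}(I) + \varepsilon\right) \le \mathbf{P}\left(\int Q \, dN_{Z_d} > \mathbf{E} \int Q \, dN_{Z_d} + \varepsilon/4\right) \le \frac{16}{\varepsilon^2} \operatorname{Var} \int Q \, dN_{Z_d}$$

and this quantity tends to zero. This is only half of (5). The other half follows by noticing that

$$\mathbf{P}\left(N_{Z_d}(I) \le \mu_{SC(0,1/\alpha)}(I) - \varepsilon\right) \le \mathbf{P}\left(N_{Z_d}(J \setminus I) > \mu_{SC(0,1/\alpha)}(J \setminus I) + \varepsilon/2\right) + \mathbf{P}\left(N_{Z_d}(J^c) \ge \varepsilon/2\right)$$

and applying the previous argument to $J \setminus I$.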

We now argue that the empirical eigenvalue distribution is stable under small perturbations. Indeed,
for any interval $[a, b]$ and any self-adjoint matrix $A_d$ with operator norm smaller than $\delta$,

$$N_{Z_d + A_d}([a + \delta, b - \delta]) \le N_{Z_d}([a, b]) \le N_{Z_d + A_d}([a - \delta, b + \delta]). \tag{6}$$

This is a consequence of the minimax formula for eigenvalues (see e.g. [8], Chapter III). We apply (6) with
$A_d = \operatorname{diag}(Y_d) - \mathrm{Id}$. By Lemma 3.7, for every $\varepsilon > 0$, $\mathbf{P}(\|A_d\| > \varepsilon)$ tends to 0 when $d$ tends to infinity. We
easily derive from (5) and (6) that, for any interval $I$,

$$\lim_{d \to \infty} \mathbf{P}\left(|N_{Y_d - \mathrm{Id}}(I) - \mu_{SC(0,1/\alpha)}(I)| > \varepsilon\right) = 0.$$

This is clearly equivalent to Theorem 1. □

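5. Proof of Proposition 4.1

We expand $\mathbf{E} \operatorname{Tr}(Z_d^k)$ and analyze the underlying combinatorics.

$$\begin{aligned}
\mathbf{E} \operatorname{Tr}(Z_d^k) &= \sum_{a \in [d]^k,\, b \in [d]^k} \Big( \prod_{i=1}^{k} \mathbf{1}_{(a_i,b_i) \ne (a_{i+1},b_{i+1})} \Big)\, \mathbf{E}\, Y^{b_1,b_2}_{a_1,a_2} Y^{b_2,b_3}_{a_2,a_3} \cdots Y^{b_{k-1},b_k}_{a_{k-1},a_k} Y^{b_k,b_1}_{a_k,a_1} \\
&= \sum_{a \in [d]^k,\, b \in [d]^k} M(a, b)\, \mathbf{E}\, W^{b_2,b_1}_{a_1,a_2} W^{b_3,b_2}_{a_2,a_3} \cdots W^{b_k,b_{k-1}}_{a_{k-1},a_k} W^{b_1,b_k}_{a_k,a_1} \\
&= \frac{1}{p^k} \sum_{a \in [d]^k,\, b \in [d]^k,\, c \in [p]^k} M(a, b)\, \mathbf{E}\, \Pi(a, b, c),
\end{aligned}$$

where we have defined

$$M(a, b) = \prod_{i=1}^{k} \mathbf{1}_{(a_i,b_i) \ne (a_{i+1},b_{i+1})}$$

and

$$\Pi(a, b, c) = G^{c_1}_{a_1,b_2} \overline{G^{c_1}_{a_2,b_1}} \cdot G^{c_2}_{a_2,b_3} \overline{G^{c_2}_{a_3,b_2}} \cdots G^{c_{k-1}}_{a_{k-1},b_k} \overline{G^{c_{k-1}}_{a_k,b_{k-1}}} \cdot G^{c_k}_{a_k,b_1} \overline{G^{c_k}_{a_1,b_k}}.$$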

We introduce some definitions in order to restrict ourselves to triples for which both $M(a, b)$ and
$\mathbf{E}\, \Pi(a, b, c)$ are nonzero.

Definition. A couple $(a, b) \in \mathbf{N}^k \times \mathbf{N}^k$ is said to be non-repeating if $M(a, b) = 1$. In other words, $(a, b)$
is non-repeating if for every $i \in [k]$, either $a_i \ne a_{i+1}$ or $b_i \ne b_{i+1}$.

Because the entries of $G_d$ are independent, we may factorize $\mathbf{E}\, \Pi(a, b, c)$ as a product of quantities of
the form $\mathbf{E}\, (G^k_{i,j})^q (\overline{G^k_{i,j}})^r$. Such a quantity is zero unless $q = r$, and $\mathbf{E}\, \Pi(a, b, c)$ is zero whenever one of
these factors is zero.

Definition. A triple $(a, b, c) \in \mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$ satisfies the matching condition if, in the following list of
$2k$ triples, each triple appears an even number of times:

$$(a_1, b_2, c_1),\ (a_2, b_1, c_1);\ (a_2, b_3, c_2),\ (a_3, b_2, c_2);\ \dots;\ (a_k, b_1, c_k),\ (a_1, b_k, c_k). \tag{7}$$

Therefore, if a triple $(a, b, c)$ does not satisfy the matching condition, then $\mathbf{E}\, \Pi(a, b, c) = 0$ both in the
real and in the complex cases. The following easy observation will be used repeatedly.

Fact 5.1. Assume that $(a, b, c)$ satisfies the matching condition. Then both $(a, c)$ and $(b, c)$ satisfy the
Wishart matching condition.

Recall the definition of equivalence introduced just before Proposition 3.4: $a \sim a'$ means that $a$ and
$a'$ induce the same partition, and $(a, b, c) \sim (a', b', c')$ means $a \sim a'$, $b \sim b'$ and $c \sim c'$. Let $C$ be the
equivalence class of a triple $(a, b, c)$. When $d \to \infty$,

$$\operatorname{card}\{C \cap ([d]^k \times [d]^k \times [p]^k)\} \sim d^{\#a} d^{\#b} p^{\#c} \sim \alpha^{\#c} d^{\ell(a,b,c)}, \tag{8}$$

where we have defined

$$\ell(a, b, c) = \#a + \#b + 2\#c.$$

Together with Lemma 3.3, Fact 5.1 implies that whenever $(a, b, c)$ satisfies the matching condition,

$$\ell(a, b, c) = \ell_W(a, c) + \ell_W(b, c) \le 2k + 2.$$

Let $\mathcal{C}_k$ be the (finite) family of all equivalence classes of triples $(a, b, c) \in \mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$ which satisfy the
matching condition. Since the quantities $M(a, b)$, $\mathbf{E}\, \Pi(a, b, c)$ and $\ell(a, b, c)$ depend only on the equivalence
class $C \in \mathcal{C}_k$ of the triple $(a, b, c)$, we may abusively write $M(C)$, $\mathbf{E}\, \Pi(C)$ and $\ell(C)$. We also write $\gamma(C)$ to
denote $\#c$. Note that these quantities do not depend on the dimension $d$. We rearrange the sum according
to equivalence classes of triples:

$$\lim_{d \to \infty} \frac{1}{d^2} \mathbf{E} \operatorname{Tr} Z_d^k = \sum_{C \in \mathcal{C}_k} M(C)\, \mathbf{E}\, \Pi(C) \lim_{d \to \infty} \frac{\operatorname{card}\{C \cap ([d]^k \times [d]^k \times [p]^k)\}}{d^2 p^k}. \tag{9}$$

Definition 5.2. Let us say that a triple $(a, b, c)$ is admissible if the following three conditions are satisfied:

(1) $(a, b, c)$ satisfies the matching condition,

(2) $(a, b)$ is non-repeating,

(3) $\ell(a, b, c) = 2k + 2$.

Denote by $\mathcal{C}_k^{\mathrm{adm}} \subset \mathcal{C}_k$ the set of equivalence classes of admissible triples.

Equation (9) implies that

$$\lim_{d \to \infty} \frac{1}{d^2} \mathbf{E} \operatorname{Tr} Z_d^k = \sum_{C \in \mathcal{C}_k^{\mathrm{adm}}} M(C)\, \mathbf{E}\, \Pi(C)\, \alpha^{\gamma(C) - k}. \tag{10}$$

Proposition 5.3. If $(a, b, c) \in \mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$ is admissible, then

(1) $M(a, b) = 1$,

(2) $\mathbf{E}\, \Pi(a, b, c) = 1$,

(3) $k$ is even,

(4) $\#c = k/2$.

Moreover, the number of equivalence classes of admissible triples in $\mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$ is equal to the Catalan
number $C_{k/2}$.

Once Proposition 5.3 is proved, Proposition 4.1 is immediate from (10).
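For very small $k$, the counting statement of Proposition 5.3 can be checked by exhaustive search. The Python sketch below enumerates one representative per equivalence class (via restricted growth strings, an enumeration device assumed here and not used in the text), tests the three admissibility conditions of Definition 5.2 on the list (7), and counts the admissible classes. The function names are illustrative, and the search is only practical for $k \le 4$ or so.

```python
from collections import Counter
from itertools import product

def canonical_words(k):
    """One representative per partition of {1, ..., k}: restricted growth strings."""
    def extend(prefix, used):
        if len(prefix) == k:
            yield tuple(prefix)
            return
        for v in range(used + 1):           # reuse an old label or introduce the next new one
            yield from extend(prefix + [v], max(used, v + 1))
    yield from extend([], 0)

def is_admissible(a, b, c):
    """Check the matching condition, the non-repeating condition and l(a, b, c) = 2k + 2."""
    k = len(a)
    triples = []
    for i in range(k):
        j = (i + 1) % k
        triples += [(a[i], b[j], c[i]), (a[j], b[i], c[i])]    # the list (7)
    matching = all(n % 2 == 0 for n in Counter(triples).values())
    non_repeating = all((a[i], b[i]) != (a[(i + 1) % k], b[(i + 1) % k]) for i in range(k))
    ell = len(set(a)) + len(set(b)) + 2 * len(set(c))
    return matching and non_repeating and ell == 2 * k + 2

def count_admissible_classes(k):
    """Number of equivalence classes of admissible triples of length k."""
    words = list(canonical_words(k))
    return sum(is_admissible(a, b, c) for a, b, c in product(words, repeat=3))
```

The value `count_admissible_classes(k)` can then be compared with $C_{k/2}$ for even $k$ and with 0 for odd $k$.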

Proof of Proposition 5.3. The fact that $M(a,b) = 1$ is just a reformulation of the non-repeating condition. We now check that $\mathbf{E}\,\Pi(a,b,c) = 1$. Indeed, since $(a,c)$ is Wishart-admissible, every element in the list (3) appears exactly twice, once at an odd position and once at an even position. But the same must be true for the list (7), and therefore $\mathbf{E}\,\Pi(a,b,c) = 1$. To check the last two conditions, we rely on the following lemma.

Lemma 5.4. Let $(a,b,c) \in \mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$ which satisfies the matching condition and such that $(a,b)$ is non-repeating. Then

(1) No index in $c$ appears only once, and therefore $\#c \le \lfloor k/2 \rfloor$,

(2) $\#a + \#b \le 2(\lfloor k/2 \rfloor + 1)$.

Proof. By contraposition, suppose that some index $c_i$ appears only once in $c$, i.e. that $c_i \ne c_j$ for every $j \ne i$. The matching condition imposes the equality

$(a_{i+1}, b_i, c_i) = (a_i, b_{i+1}, c_i)$

which in turn implies $(a_i, b_i) = (a_{i+1}, b_{i+1})$, contradicting the non-repeating property. For the second part of the lemma, we argue differently according to the parity of $k$.

($k$ odd) Define $(x,y) \in \mathbf{N}^k \times \mathbf{N}^k$ as follows:

$x = (a_1, a_3, \dots, a_{k-2}, a_k, a_2, a_4, \dots, a_{k-1}), \quad y = (b_2, b_4, \dots, b_{k-1}, b_1, b_3, \dots, b_{k-2}, b_k).$

The matching condition implies that $(x,y)$ is Wishart-admissible. Therefore, by Lemma 3.3, we have $\#x + \#y \le k+1$. Since $x$ (resp. $y$) is a permutation of $a$ (resp. $b$), we have

$\#a + \#b \le k + 1 = 2(\lfloor k/2 \rfloor + 1).$

($k$ even) Define $(x_1,y_1)$ and $(x_2,y_2) \in \mathbf{N}^{k/2} \times \mathbf{N}^{k/2}$ as follows:

$x_1 = (a_1, a_3, \dots, a_{k-1}), \quad y_1 = (b_2, b_4, \dots, b_k),$

$x_2 = (a_2, a_4, \dots, a_k), \quad y_2 = (b_3, b_5, \dots, b_{k-1}, b_1).$
Then both $(x_1,y_1)$ and $(x_2,y_2)$ are Wishart-admissible. Therefore, using Lemma 3.3 we obtain

$\#a + \#b \le \#x_1 + \#x_2 + \#y_1 + \#y_2 \le 2(k/2 + 1).$

In both cases we proved $\#a + \#b \le 2(\lfloor k/2 \rfloor + 1)$. □

We continue the proof of Proposition 5.3. If $(a,b,c)$ is admissible, Lemma 5.4 implies that $2k+2 = \ell(a,b,c) \le 4\lfloor k/2 \rfloor + 2$. Therefore, $k$ must be even, and necessarily $\#c = k/2$ and each index in $c$ appears exactly twice.

To prove the last statement in Proposition 5.3, we are going to show that the following map

$\mathcal{C}_k^{\mathrm{adm}} \to NC_2(k)$

$(a,b,c) \mapsto \pi(c)$

is bijective. First, the partition induced by $c$ is indeed a chording of $[k]$ (this partition is non-crossing since $(a,c)$ is Wishart-admissible). Because an element of a Wishart-admissible couple is determined (up to equivalence) by the other one, it follows that the map is injective.

We now show that this map is onto. Given a partial chording $\pi \in NC_2(k)$, there is a Wishart-admissible couple $(a,c) \in \mathbf{N}^k \times \mathbf{N}^k$ such that $\pi(c) = \pi$. It remains to check that $(a,a,c)$ is admissible.

• The couple $(a,a)$ is non-repeating. Otherwise, one would have $a_i = a_{i+1}$ for some index $i \in [k]$. Since $\pi(c) = K(\pi(a))$, this would imply by Lemma 3.2 that $\{i\}$ is a block in $\pi(c)$, which is not possible if $\pi(c)$ is a chording.

• The triple $(a,a,c)$ satisfies the matching condition. Since we already know that $(a,c)$ satisfies the Wishart matching condition, we have to check the following: whenever $(a_i,c_i) = (a_{j+1},c_j)$, we have $a_{i+1} = a_j$. Suppose $(a_i,c_i) = (a_{j+1},c_j)$. Since $(a,a)$ is non-repeating, we have $i \ne j$. This implies that $\{i,j\}$ must be a block in $\pi(c)$ and the result now follows from the second part of Lemma 3.2.

Therefore, the map is bijective, and the cardinal of $\mathcal{C}_k^{\mathrm{adm}}$ equals the cardinal of $NC_2(k)$, which by Lemma 3.1 is the Catalan number $C_{k/2}$. □
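The count used at the end of this proof, namely that the non-crossing pair partitions (chordings) of $[k]$ are enumerated by the Catalan number $C_{k/2}$, can be checked by brute force for small $k$. The sketch below enumerates all pair partitions and keeps the non-crossing ones. It is an illustration only, and the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def pairings(points):
    """Yield all pair partitions of the sorted list `points`."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def is_noncrossing(pairing):
    """Two blocks {i, j} and {u, v} cross when i < u < j < v (or symmetrically)."""
    for (i, j), (u, v) in combinations(pairing, 2):
        if i < u < j < v or u < i < v < j:
            return False
    return True


def count_chordings(k):
    """Number of non-crossing pair partitions of {1, ..., k} (zero when k is odd)."""
    if k % 2:
        return 0
    return sum(is_noncrossing(p) for p in pairings(list(range(1, k + 1))))


def catalan(n):
    """The n-th Catalan number, binomial(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


for k in (2, 4, 6, 8):
    print(k, count_chordings(k), catalan(k // 2))
```

The enumeration grows like $(k-1)!!$, so it is only meant for small even $k$.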

6. Proof of Proposition 4.2
Start with a formula from the previous section:

$\operatorname{Tr}(Z_d^k) = \frac{1}{p^k} \sum_{a \in [d]^k,\, b \in [d]^k,\, c \in [p]^k} M(a,b)\, \Pi(a,b,c).$

The covariance of two random variables $X, Y$ is defined as $\operatorname{Cov}(X,Y) = \mathbf{E}(X\bar{Y}) - \mathbf{E}X \cdot \mathbf{E}\bar{Y}$. We have
(11) $\operatorname{Var}\Big(\frac{1}{d^2} \operatorname{Tr} Z_d^k\Big) = \frac{1}{d^4 p^{2k}} \sum_{a,b,c,a',b',c'} M(a,b)\, M(a',b')\, \operatorname{Cov}\big(\Pi(a,b,c), \Pi(a',b',c')\big),$

where the summation is taken over indices $a, b, a', b'$ in $[d]^k$, and $c, c'$ in $[p]^k$. We first identify the vanishing contributions.

Lemma 6.1. Let $(a,b,c)$ and $(a',b',c')$ be two triples in $\mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$ such that

$\operatorname{Cov}\big(\Pi(a,b,c), \Pi(a',b',c')\big) \ne 0.$
Then $\ell(a,b,c) + \ell(a',b',c') \le 4k+2$.

Proof. The independence of entries of $G_d$ shows that the following two conditions must hold:

• Each couple in the following list of $4k$ elements appears at least twice:

(12) $(a_1,b_2,c_1), (a_2,b_1,c_1), \dots, (a_k,b_1,c_k), (a_1,b_k,c_k)\,;\ (a'_1,b'_2,c'_1), (a'_2,b'_1,c'_1), \dots, (a'_k,b'_1,c'_k), (a'_1,b'_k,c'_k).$

• At least some couple appears both in the left half and in the right half of the list (12). Otherwise, the random variables $\Pi(a,b,c)$ and $\Pi(a',b',c')$ would be independent, and their covariance would be zero.

As is immediately checked, these conditions imply that $a, c, a', c'$ satisfy the hypotheses of Lemma 3.5. Therefore,

$\ell_W(a,c) + \ell_W(a',c') \le 2k+1.$
Similarly, one may apply Lemma 3.5 to $b, c, b', c'$ to obtain

$\ell_W(b,c) + \ell_W(b',c') \le 2k+1.$
It remains to add both inequalities. □

We now gather the non-zero terms appearing in the sum (11) according to the equivalence class of $(a,b,c,a',b',c')$. The cardinality of the equivalence class of $(a,b,c,a',b',c')$ is bounded by

$d^{\#a+\#b+\#a'+\#b'} p^{\#c+\#c'} = O\big(d^{\ell(a,b,c)+\ell(a',b',c')}\big) = O\big(d^{4k+2}\big).$

The overall factor $1/(d^4 p^{2k}) = O(1/d^{4k+4})$ in front of the sum (11) shows that each class has contribution asymptotically zero. Since the number of equivalence classes depends only on $k$, this proves Proposition 4.2.

7. Convergence of extreme eigenvalues: proof of Theorem 2

Let $G_d$ be a $d^2 \times p$ matrix with independent $N(0,1)$ entries, $W_d = \frac{1}{p} G_d G_d^\dagger$, $Y_d = W_d^\Gamma$ and $Z_d = Y_d - \operatorname{diag}(Y_d)$. Assume that $p = \lfloor \alpha d^2 \rfloor$.

Half of Theorem 2 can be deduced from Theorem 1. Indeed, for every $\varepsilon > 0$, let $I$ be the interval $[1 + 2/\sqrt{\alpha} - \varepsilon, 1 + 2/\sqrt{\alpha}]$. Since $\mu_{SC(1,1/\alpha)}(I) > 0$, Theorem 1 implies that, with probability tending to 1, $N_{Y_d}(I) > 0$, which means $\lambda_{\max}(Y_d) \ge 1 + 2/\sqrt{\alpha} - \varepsilon$. A similar argument shows that $\lambda_{\min}(Y_d) \le 1 - 2/\sqrt{\alpha} + \varepsilon$ with probability tending to 1.
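The objects $W_d$ and $Y_d$ can be built explicitly in small dimension. The following sketch, assuming real Gaussian entries and using only the Python standard library, forms $W_d$, transposes each $d \times d$ block to get $Y_d$, and compares the normalized second moment $\frac{1}{d^2}\operatorname{Tr}(Y_d - \mathrm{Id})^2$ with $1/\alpha$, the variance of $SC(1,1/\alpha)$. The helper names and the choice $d = 4$, $\alpha = 4$ are illustrative assumptions.

```python
import random


def wishart(n, p, rng):
    """Return W = G G^T / p for an n x p matrix G of independent real N(0, 1) entries."""
    g = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    return [[sum(x * y for x, y in zip(g[r], g[s])) / p for s in range(n)]
            for r in range(n)]


def partial_transpose(w, d):
    """Transpose each d x d block of the d^2 x d^2 matrix w.

    Rows and columns are indexed by pairs (i, j) -> i * d + j; the entry at
    ((i, j), (k, l)) is moved to ((i, l), (k, j)).
    """
    n = d * d
    y = [[0.0] * n for _ in range(n)]
    for i in range(d):
        for j in range(d):
            for k in range(d):
                for l in range(d):
                    y[i * d + l][k * d + j] = w[i * d + j][k * d + l]
    return y


def centered_second_moment(m):
    """Return (1/n) Tr (m - Id)^2 for a symmetric n x n matrix m."""
    n = len(m)
    return sum((m[r][s] - (r == s)) ** 2 for r in range(n) for s in range(n)) / n


d, alpha = 4, 4
rng = random.Random(0)
w = wishart(d * d, alpha * d * d, rng)
y = partial_transpose(w, d)
print(centered_second_moment(y), 1 / alpha)
```

Partial transposition only permutes off-diagonal entries, so the second moment is the same for $Y_d$ and $W_d$; it is the higher moments and the extreme eigenvalues that separate the semicircular limit from the Marcenko-Pastur one. Computing eigenvalues would require a numerical linear algebra library, which this sketch deliberately avoids.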

To prove the other half of Theorem 2 (the hard part), we are going to give an upper bound on $\mathbf{E}\operatorname{Tr}(Z_d^k)$ which holds in any fixed dimension (as opposed to asymptotic estimates from the previous sections).

Proposition 7.1. There is a polynomial $Q$ such that, for any integer $k$,

$\mathbf{E}\operatorname{Tr}(Z_d^k) \le (2/p)^k\, (d + Q(k))^{k+2}\, (\sqrt{p} + Q(k))^k.$

Assume for the moment that Proposition 7.1 is true. We claim that it implies that for every $\varepsilon > 0$,

$\lim_{d\to\infty} \mathbf{P}\big(\|Y_d - \mathrm{Id}\| \ge 2/\sqrt{\alpha} + \varepsilon\big) = 0,$

from which Theorem 2 follows. Indeed, choose $k = k(d)$ an even integer such that $Q(k) = o(d)$ and $\log d = o(k)$. Then, when $d \to \infty$, Proposition 7.1 implies

$\mathbf{E}\|Z_d\|^k \le \mathbf{E}\operatorname{Tr}(Z_d^k) \le \Big(\frac{2}{\sqrt{\alpha}} + o(1)\Big)^k.$
Therefore, it follows from Markov's inequality that for every $\varepsilon > 0$,

$\mathbf{P}\big(\|Z_d\| \ge 2/\sqrt{\alpha} + \varepsilon\big) \le \Big(\frac{2/\sqrt{\alpha} + o(1)}{2/\sqrt{\alpha} + \varepsilon}\Big)^k \to 0.$



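On the other hand, by Lemma 3.7,

$\mathbf{P}\big(\|\operatorname{diag}(Y_d) - \mathrm{Id}\| > \varepsilon\big) \le d^2 \exp(-c p \varepsilon^2) \to 0.$
This completes the proof of Theorem 2 since

$\mathbf{P}\big(\|Y_d - \mathrm{Id}\| \ge 2/\sqrt{\alpha} + \varepsilon\big) \le \mathbf{P}\big(\|\operatorname{diag}(Y_d) - \mathrm{Id}\| > \varepsilon/2\big) + \mathbf{P}\big(\|Z_d\| > 2/\sqrt{\alpha} + \varepsilon/2\big).$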

Proof of Proposition 7.1. Recall the computation from Section 5:

(13) $\operatorname{Tr}(Z_d^k) = \frac{1}{p^k} \sum_{a \in [d]^k,\, b \in [d]^k,\, c \in [p]^k} M(a,b)\, \Pi(a,b,c).$

We first give an upper bound on $\mathbf{E}\,\Pi(a,b,c)$.

Lemma 7.2. Let $(a,b,c) \in \mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$ satisfy the matching condition, and denote

$\Delta = 2k + 2 - \ell(a,b,c).$

Note that $\Delta \ge 0$. Then

(1) The number $N$ of indices $i \in [2k]$ such that the $i$th term in the list (7) appears 4 times or more is bounded by $2\Delta$,

(2) We have $\mathbf{E}\,\Pi(a,b,c) \le (C_0 k)^{\Delta}$, where $C_0$ is an absolute constant.

Proof. At least one of the numbers $k + 1 - \ell_W(a,c)$ and $k + 1 - \ell_W(b,c)$ must be smaller than $\Delta/2$, since their sum equals $\Delta$. Without loss of generality, we may assume that $k + 1 - \ell_W(a,c) \le \Delta/2$. Then, Lemma 3.3 implies that $n_+(a,c) \le 2\Delta$. Since $N \le n_+(a,c)$, the first part of the lemma follows.

For the second part, we use independence to write $\mathbf{E}\,\Pi(a,b,c)$ as a product of quantities of the form $\mathbf{E}\,(G_{i,j})^{q_1} (\overline{G_{i,j}})^{q_2} \le \mathbf{E}\,|G_{i,j}|^{q_1+q_2}$. If $G$ is a $N(0,1)$ random variable, then $\mathbf{E}|G|^{2n}$ equals $1 \cdot 3 \cdot 5 \cdots (2n-1)$ in the real case and $n!$ in the complex case. In both cases, for some constant $C_0$,

(14) $\mathbf{E}|G|^q \begin{cases} = 1 & \text{if } q = 2, \\ \le (C_0\sqrt{q})^q & \text{if } q > 2. \end{cases}$

Bounding each individual factor according to (14) and using $q \le 2k$ leads to

$\mathbf{E}\,\Pi(a,b,c) \le (C_0\sqrt{2k})^N$

and the second part of the lemma follows. □
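The two moment formulas quoted in this proof are easy to tabulate. The sketch below computes $\mathbf{E}|G|^{2n}$ in the real and complex cases and compares both with $q^{q/2} = (\sqrt{q})^q$ for $q = 2n$, that is, a bound of the type used above with constant 1 and restricted to even exponents. The function names are illustrative.

```python
from math import factorial


def real_gaussian_moment(n):
    """E|G|^(2n) for a real N(0, 1) variable G: 1 * 3 * 5 * ... * (2n - 1)."""
    result = 1
    for odd in range(1, 2 * n, 2):
        result *= odd
    return result


def complex_gaussian_moment(n):
    """E|G|^(2n) for a standard complex Gaussian variable G: n!."""
    return factorial(n)


for n in range(1, 6):
    q = 2 * n
    # Both moments equal 1 when q = 2 and stay below q^(q/2) = (sqrt(q))^q.
    print(q, real_gaussian_moment(n), complex_gaussian_moment(n), q ** n)
```

Odd exponents are not covered by this table; they only change the value of the absolute constant.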
The number of triples in $[d]^k \times [d]^k \times [p]^k$ equivalent to a given triple $(a,b,c)$ is equal to

$d(d-1)\cdots(d-\#a+1) \cdot d(d-1)\cdots(d-\#b+1) \cdot p(p-1)\cdots(p-\#c+1) \le d^{\#a+\#b} p^{\#c}.$

Therefore, it is convenient to rearrange the sum (13) according to the values of $\#a + \#b$ and $\#c$. We denote by $m_{\ell_1,\ell_2}$ the number of equivalence classes of triples $(a,b,c) \in \mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$ which satisfy the matching condition, with $(a,b)$ non-repeating, $\#a + \#b = \ell_1$ and $\#c = \ell_2$. It follows from the analysis above that

(15) $\mathbf{E}\operatorname{Tr}(Z_d^k) \le \frac{1}{p^k} \sum_{\ell_1,\ell_2} d^{\ell_1} p^{\ell_2}\, m_{\ell_1,\ell_2}\, (C_0 k)^{2k+2-\ell_1-2\ell_2}.$

By Lemma 5.4, $m_{\ell_1,\ell_2} = 0$ if either $\ell_1 > k + 2$ or $\ell_2 > k/2$. It remains to give a bound on the number $m_{\ell_1,\ell_2}$. This is the content of the following proposition (we postpone the proof to the end of the section).

Proposition 7.3. There is a polynomial $P$ such that the following holds. Denote by $N_\Delta$ the number of equivalence classes of triples $(a,b,c) \in \mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$ which satisfy the matching condition, with $(a,b)$ non-repeating and $\ell(a,b,c) = 2k + 2 - \Delta$. We have the bound

(16) $N_\Delta \le 2^k P(k)^\Delta.$

Remark. The bound given in (16) is quite sharp. Indeed, for $\Delta = 0$, it gives $N_0 \le 2^k$. But $N_0$ is exactly the number of equivalence classes of admissible triples considered in Section 5, where this number was shown to equal the Catalan number $C_{k/2}$, only slightly smaller than $2^k$.

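We continue the proof of Proposition 7.1. We have

$m_{\ell_1,\ell_2} \le N_{2k+2-\ell_1-2\ell_2} \le 2^k P(k)^{2k+2-\ell_1-2\ell_2}.$

Plugging this into (15) and denoting $Q$ the polynomial $Q(k) = C_0 k P(k)$,

$\mathbf{E}\operatorname{Tr}(Z_d^k) \le \frac{2^k}{p^k} \sum_{\ell_1=2}^{k+2} \sum_{\ell_2=1}^{k/2} d^{\ell_1} p^{\ell_2}\, Q(k)^{2k+2-\ell_1-2\ell_2}$

$= \frac{2^k}{p^k} \Big( \sum_{\ell_1=2}^{k+2} d^{\ell_1} Q(k)^{k+2-\ell_1} \Big) \Big( \sum_{\ell_2=1}^{k/2} (\sqrt{p})^{2\ell_2} Q(k)^{k-2\ell_2} \Big)$

$\le (2/p)^k\, (d + Q(k))^{k+2}\, (\sqrt{p} + Q(k))^k.$

This completes the proof of Proposition 7.1. □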

Proof of Proposition 7.3. For $(a,b,c) \in \mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$, let $I = I(a,b,c) \subset [k-1]$ be the subset of indices $i$ such that the following conditions hold:

(1) $a_{i+1} \notin \{a_j : j < i+1\}$ (one says that $a_{i+1}$ is an innovation),

(2) $b_{i+1} \notin \{b_j : j < i+1\}$ (one says that $b_{i+1}$ is an innovation),

(3) $c_i \notin \{c_j : j < i\}$ (one says that $c_i$ is an innovation).

The next lemma shows that the set $I(a,b,c)$ is large when $\Delta$ is small. We postpone the proof.






Lemma 7.4. If $(a,b,c) \in \mathbf{N}^k \times \mathbf{N}^k \times \mathbf{N}^k$ satisfies the matching condition with $(a,b)$ non-repeating, then

$\operatorname{card} I(a,b,c) \ge k/2 - \Delta,$

where $\Delta = (2k+2) - \ell(a,b,c)$.

Let $A, C$ be subsets of $[k]$. A couple $(a,c)$ satisfying the Wishart matching condition is said to be compatible with $(A,C)$ if

(1) for every $i \in A$, the index $a_i$ is an innovation, i.e. $a_i \notin \{a_j : j < i\}$,

(2) for every $i \in C$, the index $c_i$ is an innovation, i.e. $c_i \notin \{c_j : j < i\}$.

Note that if a Wishart-admissible couple $(a,c)$ is compatible with $(A,C)$, then by arguing as in the proof of Lemma 3.3, we have

$\operatorname{card} A + \operatorname{card} C \le \ell_W(a,c) \le k + 1.$
Let us state one more lemma, postponing the proof.

Lemma 7.5. Let $A, C$ be subsets of $[k]$, and $\delta = k + 1 - \operatorname{card} A - \operatorname{card} C$. The number of equivalence classes of couples $(a,c) \in \mathbf{N}^k \times \mathbf{N}^k$ which satisfy the Wishart matching condition and are compatible with $(A,C)$ is bounded by $(2k)^{9\delta}$.

The number $N_\Delta$ is the number (up to equivalence) of triples $(a,b,c)$ which satisfy the matching condition, with $(a,b)$ non-repeating, and $\ell(a,b,c) = (2k+2) - \Delta$. To bound $N_\Delta$, we first choose a set $I \subset [k-1]$ of cardinal at least $k/2 - \Delta$. The number of possibilities for $I$ is bounded by $2^k$. Now, given $I$, let $I^+$ be the subset of $[k]$ defined as

$j \in I^+ \iff j = 1 \text{ or } j - 1 \in I.$

If $(a,b,c)$ satisfies the matching condition with $I(a,b,c) = I$, then it is easily checked that both couples $(a,c)$ and $(b,c)$ are compatible with $(I^+, I)$. We have $\operatorname{card}(I^+) + \operatorname{card}(I) = 2\operatorname{card}(I) + 1 \ge k + 1 - 2\Delta$. By Lemma 7.5, the number of admissible couples compatible with $(I^+, I)$ is bounded by $(2k)^{18\Delta}$. Therefore the number of possible triples $(a,b,c)$ is bounded by $(2k)^{36\Delta}$. This yields the bound

$N_\Delta \le 2^k (2k)^{36\Delta}.$

This proves Proposition 7.3 with $P(k) = (2k)^{36}$. □
Proof of Lemma 7.4. For each index $i \in [k]$, one of the following possibilities occurs:

$P_1(i)$: The indices $a_{i+1}$, $b_{i+1}$ and $c_i$ are innovations. Necessarily the triples $(a_i,b_{i+1},c_i)$ and $(a_{i+1},b_i,c_i)$ are innovations (see footnote 3).

$P_2(i)$: The triples $(a_i,b_{i+1},c_i)$ and $(a_{i+1},b_i,c_i)$ are innovations, but at least one of $a_{i+1}$, $b_{i+1}$ and $c_i$ is not an innovation.

$P_3(i)$: Only one of the triples $(a_i,b_{i+1},c_i)$ and $(a_{i+1},b_i,c_i)$ is an innovation.

$P_4(i)$: Neither $(a_i,b_{i+1},c_i)$ nor $(a_{i+1},b_i,c_i)$ is an innovation.

3 We say that a triple at the $j$th position from the list (7) is an innovation if it does not coincide with a triple at the $i$th position for $i < j$.

For $j \in \{1,2,3,4\}$, let $n_j$ be the number of indices $i \in [k]$ such that $P_j(i)$ holds in the above alternative. With this notation, $n_1 = \operatorname{card} I(a,b,c)$. The numbers $n_1, n_2, n_3, n_4$ satisfy the following relations:

(17) $n_1 + n_2 + n_3 + n_4 = k,$

(18) $n_3 + 2n_4 \ge k,$

(19) $4n_1 + 3n_2 + n_3 \ge 2k - \Delta.$

• Equation (17) is obvious since possibilities $P_1(i), \dots, P_4(i)$ are mutually exclusive.

• There must be at least $k$ elements in the list (7) which are not innovations, since every element must appear at least twice. But the number of non-innovations in the list (7) is equal to $n_3 + 2n_4$, hence the equation (18).

• For each $i$, let $Z_i$ be the number

$Z_i = \mathbf{1}_{\{a_{i+1} \text{ is an innovation}\}} + \mathbf{1}_{\{b_{i+1} \text{ is an innovation}\}} + 2 \cdot \mathbf{1}_{\{c_i \text{ is an innovation}\}}.$

The value of $Z_i$ depends on which of $P_1(i), P_2(i), P_3(i), P_4(i)$ occurs. If $P_1(i)$ occurs, then $Z_i = 4$. If $P_4(i)$ occurs, then $Z_i = 0$. If $P_2(i)$ occurs, then $Z_i \le 3$. If $P_3(i)$ occurs, then $Z_i \le 1$. This last point deserves some explanation.

- If $(a_i,b_{i+1},c_i)$ is not an innovation, then certainly $b_{i+1}$ and $c_i$ cannot be innovations.

- If instead $(a_{i+1},b_i,c_i)$ is not an innovation, then $a_{i+1}$ cannot be an innovation. We claim that $c_i$ is also not an innovation. Indeed, if $c_i$ was an innovation, then necessarily $(a_{i+1},b_i,c_i)$ would be equal to $(a_i,b_{i+1},c_i)$ which would contradict the non-repeating property.

This shows that $\sum Z_i \le 4n_1 + 3n_2 + n_3$. On the other hand, we have

$\sum_{i=1}^{k} Z_i = \#a - 1 + \#b - 1 + 2\#c = 2k - \Delta.$

Therefore, the above discussion implies equation (19).
Adding (19) and twice (18), we obtain

$4n_1 + 3n_2 + 3n_3 + 4n_4 \ge 4k - \Delta.$

Together with (17), this implies that $n_2 + n_3 \le \Delta$. Since $n_3 \ge 0$, this in turn implies $3n_2 + n_3 \le 3\Delta$. Combined with (19), we obtain $4n_1 \ge 2k - 4\Delta$, hence $n_1 \ge k/2 - \Delta$ as claimed. □
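The final arithmetic step can be double-checked by exhaustive search: for non-negative integers with $n_1 + n_2 + n_3 + n_4 = k$, $n_3 + 2n_4 \ge k$ and $4n_1 + 3n_2 + n_3 \ge 2k - \Delta$, the conclusion $n_1 \ge k/2 - \Delta$ should never fail. The following sketch searches for counterexamples over a small range; the function name and the range are arbitrary choices.

```python
from itertools import product


def violations(max_k):
    """List (k, delta, n1, n2, n3, n4) satisfying (17)-(19) but with n1 < k/2 - delta."""
    bad = []
    for k in range(1, max_k + 1):
        for n1, n2, n3 in product(range(k + 1), repeat=3):
            n4 = k - n1 - n2 - n3          # relation (17)
            if n4 < 0 or n3 + 2 * n4 < k:  # relation (18)
                continue
            for delta in range(2 * k + 1):
                # relation (19) holds, yet 2 * n1 < k - 2 * delta
                if 4 * n1 + 3 * n2 + n3 >= 2 * k - delta and 2 * n1 < k - 2 * delta:
                    bad.append((k, delta, n1, n2, n3, n4))
    return bad


print(violations(8))
```

An empty list is consistent with the derivation. Relation (18) is essential: without it, $k = 2$, $\Delta = 0$, $(n_1,n_2,n_3,n_4) = (0,2,0,0)$ satisfies the other two relations but not the conclusion.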

Proof of Lemma 7.5. Given a couple $(a,c) \in \mathbf{N}^k \times \mathbf{N}^k$ satisfying the Wishart matching condition, there is a partition of $[2k]$ as

(20) $[2k] = T_1 \cup T_2 \cup T_3 \cup T_4$

where $T_i$ denotes the set of indices $j$ such that the $j$th element in the list (3) is of type $i$ (the four possible types have been defined in Section 2). If the couple $(a,c)$ is compatible with $(A,C)$, then necessarily $T_1^* \subset T_1$ and $T_2^* \subset T_2$, where

$T_1^* = \{2(i-1) : i \in A,\ i \ne 1\},$
$T_2^* = \{2i - 1 : i \in C\}.$

We claim that the number of partitions (20) satisfying these constraints is bounded by $(2k)^{3\delta}$. Indeed, we first have to enlarge $T_1^*$ into $T_1$ and $T_2^*$ into $T_2$. Since $\operatorname{card}(T_1^* \cup T_2^*) = k - \delta$ and $\operatorname{card}(T_1 \cup T_2) \le k$, the number of possible ways to perform these enlargements is at most $(2k)^{\delta}$.

Since $\operatorname{card}(T_3) = \operatorname{card}(T_1) + \operatorname{card}(T_2)$, we have $\operatorname{card}(T_4) \le 2\delta$. Therefore the number of possible choices for $T_4$ is bounded by $(2k)^{2\delta}$. Once $T_1$, $T_2$ and $T_4$ are chosen, the set $T_3$ consists of the remaining indices. Hence the claim on the number of possible partitions.

Now, by Lemma 3.6, the number of equivalence classes of couples satisfying the Wishart matching condition with a given partition (20) is bounded by

$(2k)^{3\operatorname{card} T_4} \le (2k)^{6\delta}.$

Finally, the total number of equivalence classes satisfying the Wishart matching condition and compatible with $(A,C)$ is bounded by $(2k)^{9\delta}$. □

8. Relevance to Quantum Information Theory 

In this section we consider finite-dimensional complex Hilbert spaces. We write $\mathcal{M}(\mathbf{C}^n)$ for the space of linear operators (= matrices) on $\mathbf{C}^n$.

8.1. PPT states. A state (= density matrix) $\rho$ on $\mathbf{C}^n$ is a positive operator on $\mathbf{C}^n$ with trace 1. We write $\mathcal{D}(\mathbf{C}^n)$ for the set of states on $\mathbf{C}^n$. A pure state is a rank one state and is denoted $\rho = |x\rangle\langle x|$, where $x$ is a unit vector in the range of $\rho$. We typically consider the case $\mathbf{C}^n \simeq \mathbf{C}^d \otimes \mathbf{C}^d$. We have the following canonical identification

$\mathcal{M}(\mathbf{C}^d \otimes \mathbf{C}^d) \simeq \mathcal{M}(\mathbf{C}^d) \otimes \mathcal{M}(\mathbf{C}^d).$
A state $\rho \in \mathcal{D}(\mathbf{C}^d \otimes \mathbf{C}^d)$ is called separable if it can be written as a convex combination of product states. A state $\rho$ is called PPT ("positive partial transpose") if $\rho^\Gamma$ is a positive operator (the partial transposition $\rho^\Gamma = (\mathrm{Id} \otimes T)\rho$ was defined in (1)). The partial transposition of a separable state $\rho$ is always positive [22]; however there exist non-separable (= entangled) PPT states. For many purposes, checking positivity of the partial transpose is the most efficient tool to detect entanglement. We refer to the survey [14] for more information about PPT states and entanglement.

8.2. Random induced states are normalized Wishart matrices. There is a canonical probability measure on the set of pure states on any finite-dimensional Hilbert space $H$, obtained by pushing forward the uniform measure on the unit sphere of $H$ under the map $x \mapsto |x\rangle\langle x|$. We define the measure $\mu_{n,p}$ to be the distribution of $\operatorname{Tr}_{\mathbf{C}^p} |x\rangle\langle x|$, where $x$ is uniformly distributed on the unit sphere of $\mathbf{C}^n \otimes \mathbf{C}^p$. The partial trace $\operatorname{Tr}_{\mathbf{C}^p}$ is the linear operation

$\operatorname{Tr}_{\mathbf{C}^p} := \mathrm{Id}_{\mathcal{M}(\mathbf{C}^n)} \otimes \operatorname{Tr} : \mathcal{M}(\mathbf{C}^n \otimes \mathbf{C}^p) \to \mathcal{M}(\mathbf{C}^n),$

where $\mathrm{Id}_{\mathcal{M}(\mathbf{C}^n)}$ is the identity operation on $\mathcal{M}(\mathbf{C}^n)$ and $\operatorname{Tr} : \mathcal{M}(\mathbf{C}^p) \to \mathbf{C}$ is the usual trace.

The measure $\mu_{n,p}$ is a probability measure on $\mathcal{D}(\mathbf{C}^n)$, the set of mixed states on $\mathbf{C}^n$. A random state with distribution $\mu_{n,p}$ is called an induced state; the space $\mathbf{C}^p$ is called the ancilla space. This family of measures has a simple physical motivation: they can be used if our only knowledge about a state is the dimensionality of the environment (see [7], Section 14.5 and references therein).

Induced states are closely related to Wishart distributions. Indeed, if $W$ is an $(n,p)$-Wishart random matrix, then $\frac{W}{\operatorname{Tr} W}$ is a random state with distribution $\mu_{n,p}$. Moreover, the random variables $\operatorname{Tr} W$ and $\frac{W}{\operatorname{Tr} W}$ are independent (this fact explicitly appears in [19]). Therefore, results about Wishart matrices can be easily translated in the language of induced states. The special case $p = n$, when the dimension of the ancilla equals the dimension of the system, deserves to be highlighted thanks to the following proposition.

Proposition 8.1. The measure $\mu_{n,n}$ is equal to the normalized Lebesgue measure restricted to the set $\mathcal{D}(\mathbf{C}^n)$.

Proposition 8.1 follows from a more general fact [24]: whenever $p \ge n$, the density of the measure $\mu_{n,p}$ with respect to the Lebesgue measure on $\mathcal{D}(\mathbf{C}^n)$ is proportional to $\det(\rho)^{p-n}$.
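The link between Wishart matrices and induced states described in this subsection suggests a direct sampling recipe: draw an $n \times p$ matrix $G$ with independent standard complex Gaussian entries and normalize $GG^*$ by its trace. The sketch below, using only the Python standard library, is a minimal illustration under that assumption; the function name is hypothetical, and the overall scale of the Gaussian entries is irrelevant because of the normalization.

```python
import random


def random_induced_state(n, p, rng):
    """Sample rho = G G^* / Tr(G G^*) for an n x p complex Gaussian matrix G."""
    g = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(p)]
         for _ in range(n)]
    # Unnormalized Wishart matrix: w[r][s] = sum_c g[r][c] * conj(g[s][c]).
    w = [[sum(x * y.conjugate() for x, y in zip(g[r], g[s])) for s in range(n)]
         for r in range(n)]
    trace = sum(w[r][r].real for r in range(n))
    return [[entry / trace for entry in row] for row in w]


rng = random.Random(1)
rho = random_induced_state(4, 6, rng)
print(sum(rho[r][r].real for r in range(4)))  # trace of the sampled state
```

The output is a Hermitian positive matrix with unit trace, that is, a state on $\mathbf{C}^n$; here $p$ plays the role of the ancilla dimension.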
8.3. Partial transposition of random induced states. Our main results admit an immediate translation in the language of random induced states. Here is a version of Theorem 1 for induced states.

Theorem 3. Fix $\alpha > 0$. For each $d$, let $\rho_d$ be a random state on $\mathbf{C}^d \otimes \mathbf{C}^d$ chosen according to the measure $\mu_{d^2, \lfloor \alpha d^2 \rfloor}$. Then for every interval $I = [a,b] \subset \mathbf{R}$ and $\varepsilon > 0$,

$\lim_{d\to\infty} \mathbf{P}\Big( \big| N_{d^2\rho_d^\Gamma}(I) - \mu_{SC(1,1/\alpha)}(I) \big| > \varepsilon \Big) = 0.$

Recall that $N_{d^2\rho_d^\Gamma}(I)$ is the proportion of eigenvalues of the matrix $\rho_d^\Gamma$ that belong to the interval $[a/d^2, b/d^2]$.



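Proof. If $W$ is a $(d^2,p)$-Wishart matrix, then $\frac{W}{\operatorname{Tr} W}$ has distribution $\mu_{d^2,p}$. Therefore, writing $\rho = \frac{W}{\operatorname{Tr} W}$,

$N_{d^2\rho^\Gamma}([a,b]) = N_{\frac{d^2}{\operatorname{Tr} W} W^\Gamma}([a,b]) = N_{W^\Gamma}\Big(\Big[\frac{\operatorname{Tr} W}{d^2}\, a, \frac{\operatorname{Tr} W}{d^2}\, b\Big]\Big).$

The distribution of $\operatorname{Tr} W$ is proportional to a $\chi^2$ distribution. Using Fact 3.8 to quantify its concentration, we obtain that for any $\eta > 0$,

(21) $\mathbf{P}\Big( \Big| \frac{\operatorname{Tr} W}{d^2} - 1 \Big| > \eta \Big) \le C \exp(-c\, d^2 p\, \eta^2).$

When $\big| \frac{\operatorname{Tr} W}{d^2} - 1 \big| \le \eta$, we may use the inclusions

$[(1+\eta)a, (1-\eta)b] \subset \Big[\frac{\operatorname{Tr} W}{d^2}\, a, \frac{\operatorname{Tr} W}{d^2}\, b\Big] \subset [(1-\eta)a, (1+\eta)b]$

to show that Theorem 1 implies Theorem 3. □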



If $d$ is fixed, the induced measures $\mu_{d^2,p}$ concentrate towards the maximally mixed state on $\mathbf{C}^d \otimes \mathbf{C}^d$ when $p$ increases. For small values of $p$, one expects to get typically very entangled states. Therefore one can consider the critical $p$ for which the property "being PPT" becomes typically true. The following theorem shows that a threshold occurs when $p = 4d^2$.

Theorem 4. For every $\varepsilon > 0$, there exist positive constants $c(\varepsilon), C(\varepsilon)$ such that the following holds. If $\rho$ is a random state on $\mathbf{C}^d \otimes \mathbf{C}^d$ chosen according to the measure $\mu_{d^2,p}$, then

(1) If $p \le (4-\varepsilon)d^2$, then

$\mathbf{P}(\rho \text{ is PPT}) \le C(\varepsilon) \exp(-c(\varepsilon)p).$

(2) If $p \ge (4+\varepsilon)d^2$, then

$\mathbf{P}(\rho \text{ is PPT}) \ge 1 - C(\varepsilon) \exp(-c(\varepsilon)p).$
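The value 4 of the threshold can be read off from the limiting spectrum: the support of $SC(1,1/\alpha)$ is $[1 - 2/\sqrt{\alpha}, 1 + 2/\sqrt{\alpha}]$, and its lower edge is positive exactly when $\alpha > 4$. The short sketch below encodes this heuristic reading for $p \approx \alpha d^2$. It is not a substitute for the quantitative statement of Theorem 4, which says nothing about the critical case $\alpha = 4$; the function names are illustrative.

```python
from math import sqrt


def limiting_spectrum_edges(alpha):
    """Edges 1 -/+ 2/sqrt(alpha) of the support of SC(1, 1/alpha)."""
    radius = 2 / sqrt(alpha)
    return 1 - radius, 1 + radius


def typical_regime(alpha):
    """Heuristic reading of the threshold for p ~ alpha * d^2 and large d."""
    lower, _ = limiting_spectrum_edges(alpha)
    if lower > 0:
        return "PPT"
    if lower < 0:
        return "non-PPT"
    return "critical"  # alpha = 4: not covered by Theorem 4


for alpha in (1, 3, 4, 5, 16):
    print(alpha, limiting_spectrum_edges(alpha), typical_regime(alpha))
```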

Proof. We only show the proof of (1), the proof of (2) being similar. We are going to use a concentration argument from [3], where the same question is studied for separability instead of PPT. We start by a lemma that compares the probability that a random state is PPT, for different dimensions.

Lemma 8.2. Let $d_1, d_2, d'_1, d'_2$ and $p$ be integers, with $d'_1 \le d_1$ and $d'_2 \le d_2$. Let $\rho$ be a random state on $\mathbf{C}^{d_1} \otimes \mathbf{C}^{d_2}$ with distribution $\mu_{d_1d_2,p}$, and let $\rho'$ be a random state on $\mathbf{C}^{d'_1} \otimes \mathbf{C}^{d'_2}$ with distribution $\mu_{d'_1d'_2,p}$. Then

$\mathbf{P}(\rho \text{ is PPT}) \le \mathbf{P}(\rho' \text{ is PPT}).$

Proof. It is enough to prove the lemma in the special case $d_2 = d'_2$ (since both factors play the same role, the full version follows by applying twice this special case).

We construct a coupling between both distributions as follows. Identify $\mathbf{C}^{d'_1}$ as a subspace of $\mathbf{C}^{d_1}$, and let $Q : \mathbf{C}^{d_1} \to \mathbf{C}^{d'_1}$ be the orthogonal projection. Then, $\mathbf{C}^{d'_1} \otimes \mathbf{C}^{d_2} \subset \mathbf{C}^{d_1} \otimes \mathbf{C}^{d_2}$ is the range of the projection $P = Q \otimes \mathrm{Id}$. Let $W$ be a $(d_1d_2,p)$-Wishart matrix, seen as an operator on $\mathbf{C}^{d_1} \otimes \mathbf{C}^{d_2}$. The random operator $PWP$, when seen as an operator on $\mathbf{C}^{d'_1} \otimes \mathbf{C}^{d_2}$, has the distribution of a $(d'_1d_2,p)$-Wishart matrix. Therefore, the states

$\rho = \frac{W}{\operatorname{Tr} W}, \qquad \rho' = \frac{P\rho P}{\operatorname{Tr} P\rho P} = \frac{PWP}{\operatorname{Tr} PWP}$

have respective distributions $\mu_{d_1d_2,p}$ and $\mu_{d'_1d_2,p}$. To prove the lemma it remains to check that

$\rho \text{ is PPT} \implies \rho' \text{ is PPT}.$

This implication holds because $(P\rho P)^\Gamma = P\rho^\Gamma P$. □
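The identity $(P\rho P)^\Gamma = P\rho^\Gamma P$ holds entry by entry, because $P = Q \otimes \mathrm{Id}$ only restricts the index of the first factor while the partial transposition only permutes indices of the second factor. The following sketch checks it on a small integer Gram matrix, with $Q$ taken (as an assumption made for concreteness) to be the projection onto the first coordinates; all names are illustrative.

```python
import random


def partial_transpose(m, d1, d2):
    """Apply Id (x) T: the entry ((i, j), (k, l)) moves to ((i, l), (k, j))."""
    n = d1 * d2
    out = [[0] * n for _ in range(n)]
    for i in range(d1):
        for j in range(d2):
            for k in range(d1):
                for l in range(d2):
                    out[i * d2 + l][k * d2 + j] = m[i * d2 + j][k * d2 + l]
    return out


def compress(m, d1, d2, d1_small):
    """Return P m P with P = Q (x) Id, Q projecting onto the first d1_small coordinates."""
    n = d1 * d2
    keep = [r // d2 < d1_small for r in range(n)]
    return [[m[r][s] if keep[r] and keep[s] else 0 for s in range(n)]
            for r in range(n)]


rng = random.Random(0)
d1, d2, d1_small = 3, 2, 2
n = d1 * d2
g = [[rng.randint(-3, 3) for _ in range(4)] for _ in range(n)]
# Integer Gram matrix, playing the role of an unnormalized Wishart matrix.
w = [[sum(x * y for x, y in zip(g[r], g[s])) for s in range(n)] for r in range(n)]
lhs = partial_transpose(compress(w, d1, d2, d1_small), d1, d2)
rhs = compress(partial_transpose(w, d1, d2), d1, d2, d1_small)
print(lhs == rhs)
```

Since compressing a positive operator by a projection keeps it positive, the identity immediately gives the implication used in the proof.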

Fix $\varepsilon > 0$. As a consequence of Lemma 8.2, it is enough to prove Theorem 4 for every given $p$, when $d$ is minimal such that $p \le (4-\varepsilon)d^2$ (from now on, we assume that $d$ and $p$ are related by this condition).

Denote by $\|\cdot\|_{PPT}$ the gauge associated to the convex body of all PPT states. This gauge is defined as follows, for any state $\rho$ on $\mathbf{C}^d \otimes \mathbf{C}^d$:

$\|\rho\|_{PPT} = \inf\Big\{ t \ge 0 : \frac{\mathrm{Id}}{d^2} + \frac{1}{t}\Big(\rho - \frac{\mathrm{Id}}{d^2}\Big) \text{ is PPT} \Big\} = 1 - d^2 \lambda_{\min}(\rho^\Gamma).$

Note in particular that $\rho$ is PPT if and only if $\|\rho\|_{PPT} \le 1$. Let $\rho_{d^2,p}$ be a random state with distribution $\mu_{d^2,p}$, and denote by $M_{d^2,p}$ the median of the random variable $\|\rho_{d^2,p}\|_{PPT}$. By applying Proposition 4.2 from [3], we obtain the following inequality: there are absolute constants $c, C$ such that for any $\eta > 0$,

(22) $\mathbf{P}\big( \big| \|\rho\|_{PPT} - M_{d^2,p} \big| > \eta \big) \le C \exp(-cp) + C \exp(-cp\eta^2).$

Let $W_{d^2,p}$ be a $(d^2,p)$-Wishart matrix. It follows from Theorem 2 that $\lambda_{\min}(W_{d^2,p}^\Gamma)$ converges in probability towards $1 - 2/\sqrt{4-\varepsilon}$ when $d, p$ tend to infinity. By (21), $\operatorname{Tr} W_{d^2,p}/d^2$ converges in probability to 1. Since $W_{d^2,p}/\operatorname{Tr} W_{d^2,p}$ has distribution $\mu_{d^2,p}$, it follows that $\|\rho_{d^2,p}\|_{PPT}$ converges in probability to $\frac{2}{\sqrt{4-\varepsilon}}$. In particular,

$\lim_{p,d\to\infty} M_{d^2,p} = \frac{2}{\sqrt{4-\varepsilon}} > 1.$

We now choose $\eta$ such that $2/\sqrt{4-\varepsilon} > 1 + \eta$. For $d, p$ large enough, we have $M_{d^2,p} > 1 + \eta$, and we can apply (22) to obtain

$\mathbf{P}(\rho \text{ is PPT}) = \mathbf{P}(\|\rho\|_{PPT} \le 1) \le C \exp(-cp) + C \exp(-cp\eta^2).$

This concludes the proof of Theorem 4 (small dimensions can be taken into account by adjusting the constants). □
9. Miscellaneous remarks

9.1. Partial transposition of a random pure state. Another natural question from the point of view of Quantum Information Theory is to study the partial transposition of random pure states (as opposed to random mixed states considered here). In that direction, one may prove the following result.

Proposition 9.1. For every $d$, let $\rho_d$ be a random pure state on $\mathbf{C}^d \otimes \mathbf{C}^d$, with uniform distribution. Then, when $d$ tends to infinity, the empirical eigenvalue distribution of $d\rho_d^\Gamma$ approaches a deterministic distribution which can be described as the distribution of the product of two independent $SC(0,1)$ random variables.

Remark. The notion of convergence used is the same as in Theorem 1. The limiting distribution appearing in Proposition 9.1 has vanishing odd moments and even moments equal to the square of Catalan numbers. Such a distribution has been studied recently in [9], where a closed formula for the density (involving special functions) is derived.

Proof of Proposition 9.1 (sketch). If $\rho = |\psi\rangle\langle\psi|$ is a pure state on $\mathbf{C}^d \otimes \mathbf{C}^d$, the eigenvalues of $\rho^\Gamma$ can be described from the Schmidt coefficients of $\psi$ (Schmidt coefficients for tensors correspond to singular values for matrices, and are therefore governed by the Marcenko-Pastur distribution). Indeed, given a Schmidt decomposition

$\psi = \sum_{i=1}^{d} \sqrt{\lambda_i}\, e_i \otimes f_i,$

for some orthonormal bases $(e_i)$, $(f_i)$, one checks that

$|\psi\rangle\langle\psi|^\Gamma = \sum_{i,j=1}^{d} \sqrt{\lambda_i\lambda_j}\, |e_i \otimes f_j\rangle\langle e_j \otimes f_i|.$

It follows that the eigenvalues of $|\psi\rangle\langle\psi|^\Gamma$ are

$\lambda_i$ for every $1 \le i \le d$,

$\pm\sqrt{\lambda_i\lambda_j}$ for every $1 \le i < j \le d$.

Eigenvalues of the first category do not contribute to the limit distribution, and the result follows with little effort. □
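The description of the spectrum of $|\psi\rangle\langle\psi|^\Gamma$ can be verified directly by exhibiting eigenvectors: $e_i \otimes f_i$ for the eigenvalue $\lambda_i$, and $e_i \otimes f_j \pm e_j \otimes f_i$ for $\pm\sqrt{\lambda_i\lambda_j}$. The sketch below does so in the simplest situation where both Schmidt bases are the canonical basis; this choice, the sample coefficients and the function names are assumptions made for illustration.

```python
from math import sqrt, isclose


def pure_state_partial_transpose(lam):
    """Matrix of |psi><psi|^Gamma for psi = sum_i sqrt(lam[i]) e_i (x) e_i."""
    d = len(lam)
    m = [[0.0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            # Term sqrt(lam_i lam_j) |e_i (x) e_j><e_j (x) e_i|.
            m[i * d + j][j * d + i] = sqrt(lam[i] * lam[j])
    return m


def apply(m, v):
    """Matrix-vector product."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in m]


def is_eigenpair(m, v, mu):
    """Check m v = mu v up to rounding."""
    return all(isclose(x, mu * y, abs_tol=1e-12) for x, y in zip(apply(m, v), v))


def basis(d, i, j):
    """Coordinates of e_i (x) e_j."""
    v = [0.0] * (d * d)
    v[i * d + j] = 1.0
    return v


lam = [0.5, 0.3, 0.2]          # Schmidt coefficients squared, summing to 1
d = len(lam)
m = pure_state_partial_transpose(lam)
print(all(is_eigenpair(m, basis(d, i, i), lam[i]) for i in range(d)))
```

The $d + 2\binom{d}{2} = d^2$ eigenvectors listed above are linearly independent, so they account for the whole spectrum.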

9.2. Unbalanced bipartite systems. We may apply partial transposition to any decomposition $\mathbf{C}^{d^2} \simeq \mathbf{C}^{d_1} \otimes \mathbf{C}^{d_2}$, with $d_1d_2 = d^2$. Provided the ratio $d_1/d_2$ stays away from $0$ and $\infty$, Theorems 1 and 2 remain valid. The point is that the main contributions come from terms in which $a \sim b$, so that $d_1^{\#a} d_2^{\#b}$ depends only on the product $d_1d_2$.

9.3. Connexions to free probability. The same model of partially transposed Wishart matrices has been considered recently by Banica and Nechita [6] in a different asymptotic regime (when $d_1$ is fixed and $d_2$ goes to infinity). For that regime the picture is different: they obtain that the limit spectral distribution can be described as the difference of two freely independent random variables with Marcenko-Pastur distributions. The shifted semicircle distribution appears then as a limit case. We refer to [6] for more information.

9.4. Uniform mixtures of random pure states. There is another popular model of random states which is very similar to the model of random induced states considered in Section 8, for which our results are also valid. Let $(\psi_i)_{1 \le i \le p}$ be unit vectors in $\mathbf{C}^n$, chosen independently according to the uniform probability measure on the unit sphere. Then we consider the random state

$\rho = \frac{1}{p} \sum_{i=1}^{p} |\psi_i\rangle\langle\psi_i|.$

Denote by $\nu_{n,p}$ the distribution of $\rho$. This model of random states has been considered for example in [23]. When $n, p$ are large, the probability measures $\mu_{n,p}$ and $\nu_{n,p}$ behave similarly. It can be shown that Theorems 3 and 4 remain valid when the probability measures $\mu_{n,p}$ are substituted by the probability measures $\nu_{n,p}$.

9.5. Volume of the PPT convex body. How many states have a positive partial transpose? This question may be formulated using the Lebesgue measure (or "volume") induced by the Hilbert-Schmidt scalar product, or equivalently (cf. Proposition 8.1) by the induced measure over an ancilla of equal dimension. Let $W_d$ be a $(d^2,d^2)$-Wishart random matrix. It was shown in [2] (formulated as a lower bound on the volume of the set of PPT states, and using techniques from high-dimensional convexity) that for some constant $C > 0$

(23) $\mathbf{P}(W_d^\Gamma \ge 0) \ge \exp(-Cd^4).$

By Theorem 2, the probability on the left-hand side tends to $0$ when $d$ tends to $+\infty$. How fast it goes to zero is actually a question about large deviations. For standard models of random matrices, very precise results are known about large deviations (see e.g. [1], Section 2.6.2), and one may expect the lower bound from (23) to be sharp.

Conjecture. There is an absolute constant $c > 0$ such that, whenever $W_d$ is a $(d^2,d^2)$-Wishart matrix,

$\mathbf{P}(W_d^\Gamma \ge 0) \le \exp(-cd^4).$
This would quantify precisely how (un)common are PPT states in large dimensions.